Graphics on the web and beyond with WebGPU

Or the like…

Getting started with WebGPU

Where do we start? The specification is still underway and resources are scarce. Nevertheless, going through the specification website helps us get an idea of how streamlined the API is: https://gpuweb.github.io/gpuweb/.


Shading language

Since WebGPU is the wannabe graphics API for the web (and native to some extent), it needs to work everywhere. Portability is key and the best way to achieve it today is via Khronos’ SPIR-V.

After a first proposal led by Apple (WHLSL) and some reluctance from Apple to work with Khronos for “private legal” reasons, a consensus seems to have been reached in February 2020: WGSL.

WebGPU Shader Language (WGSL)

It started as a proposal from Google named Tint. Its main characteristic is that “it can be faithfully and simply converted to and from SPIR-V ‘Shader’ modules that only use GPU features available in WebGPU”.

Tint Proposal: “WebGPU ingests a new text based language which is bijective to SPIR-V.”

This goes in the direction of the Web: it should work the same everywhere, no matter what platform.

Being text based allows us poor developers to dynamically generate our shaders at runtime without needing to embed a JavaScript to SPIR-V compiler.
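As a rough illustration of what generating shaders at runtime can look like, here is a minimal sketch that builds a fragment shader source string from a color. It uses GLSL only because that is the syntax most WebGL developers know; the same idea applies to any text-based shading language. The makeSolidColorShader helper is hypothetical and not part of any API.

```javascript
// Hypothetical helper: builds a GLSL fragment shader source string at runtime.
function makeSolidColorShader([r, g, b, a]) {
  // Force a decimal point so that GLSL reads the literals as floats.
  const f = (n) => (Number.isInteger(n) ? n.toFixed(1) : String(n));
  return `#version 300 es
precision mediump float;
out vec4 out_FragColor;

void main() {
  out_FragColor = vec4(${f(r)}, ${f(g)}, ${f(b)}, ${f(a)});
}`;
}

const source = makeSolidColorShader([0.4, 0.4, 0.8, 1]);
// `source` is a plain string that can be handed to a shader compiler.
```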

A new language also means we’ll need to rewrite all of our shaders, yay. Hopefully, with the help of good tooling developed by the WebGPU Community Group and thanks to the similarities between HLSL, GLSL and WGSL (maybe), this task should become trivial (fingers crossed).

Want a peek?

#version 300 es
out vec4 out_FragColor;

void main() {
	out_FragColor = vec4(0.4, 0.4, 0.8, 1.0);
}

should translate to

[[location 0]] var<out> gl_FragColor : vec4<f32>;

fn main() -> void {
	gl_FragColor = vec4<f32>(0.4, 0.4, 0.8, 1.0);
	return;
}
entry_point fragment = main;

The full WIP spec is here: https://gpuweb.github.io/gpuweb/wgsl.html

GLSL

But… I got used to GLSL. Although the specification clearly states “WebGPU is not related to WebGL and does not explicitly target OpenGL ES”, coming from WebGL, we’ll want to keep our shaders understandable and use what we already know. Good news everyone, we can with a GLSL-to-SPIR-V compiler for the Web (a build of glslang with WebAssembly):

const glslang = await (
	await import("@webgpu/glslang@0.0.13/dist/web-devel/glslang.js")
).default();

// ...

device.createShaderModule({
	code: glslang.compileGLSL(mySource, "vertex"),
});

And without realising it, this is the first step in understanding WebGPU’s API.


Designing the right abstraction

High-level libraries like Three.js and Babylon.js are a good way to get into 3D graphics on the web, but they feel much more similar to using 3D software (Blender, C4D…) than to actually interacting with the GPU. That is good when we want something on the screen quickly, but we can end up distancing ourselves from how all of it really works. In certain cases though, we might need to understand more, either to tackle various problems or purely out of curiosity.

It is true that looking at what it takes to draw a triangle in Vulkan, one might get scared of getting closer to the metal. WebGPU is one level above though, and feels graspable, notably due to its descriptive nature. If we look at the spec shared above, we can just go through the API, look at the descriptors (the formatted objects we send to set the state of the GPU) and search for any term we don’t understand. The rest is pure data and buffer binding.

https://dmnsgn.github.io/dgel/?id=instancing

I have had a try at putting all the pieces together in a coherent way. I’ll attempt to describe the different building blocks following a simple example:

https://dmnsgn.github.io/dgel/?id=cube

This example has been tested in Chrome Canary (after enabling chrome://flags/#enable-unsafe-webgpu flag) and its source code is available here: https://dmnsgn.github.io/dgel/examples/cube.js.

Setup

Similar to WebGL, we’ll need to grab a canvas context:

const context = canvas.getContext("gpupresent");

The type is gpupresent at the time of writing. Its naming is based on xrpresent, which is not really a thing anymore, so it might or might not get simplified in the future.

Once that’s done, we need to request two core objects from the API: an adapter and a device.

const adapter = await navigator.gpu.requestAdapter();
const device = await adapter.requestDevice();

I can’t explain it any better than the spec, so here it is:

An adapter represents an implementation of WebGPU on the system. Each adapter identifies both an instance of a hardware accelerator (e.g. GPU or CPU) and an instance of a browser’s implementation of WebGPU on top of that accelerator.

Similar to a webgl or webgl2 context, the adapter request is where we can ask for a “low-power” or “high-performance” powerPreference.
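A minimal sketch of passing that preference along, assuming the gpu argument is navigator.gpu in a WebGPU-enabled browser. The requestAdapterAndDevice wrapper is a hypothetical helper, and the null check is a defensive assumption in case no adapter is returned.

```javascript
// Sketch: request an adapter with a power preference, then a device.
// `gpu` is expected to be `navigator.gpu` in a browser with WebGPU enabled.
async function requestAdapterAndDevice(gpu, powerPreference = "high-performance") {
  const adapter = await gpu.requestAdapter({ powerPreference });
  // Defensive check (assumption): bail out if no adapter is available.
  if (!adapter) throw new Error("No suitable WebGPU adapter found");
  const device = await adapter.requestDevice();
  return { adapter, device };
}

// In a browser:
// const { adapter, device } = await requestAdapterAndDevice(navigator.gpu, "low-power");
```

Taking the gpu object as a parameter keeps the helper easy to exercise with a stand-in object outside the browser.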

We can think of the device as the shared context on which the GPU commands will be executed. The actual context (obtained from canvas.getContext) will only be used to call configureSwapChain:

const swapChain = context.configureSwapChain({
	device,
	format: "bgra8unorm",
	usage: GPUTextureUsage.OUTPUT_ATTACHMENT,
});

Note: in a Web use case, the snippets above will most likely always be abstracted as we’ll use the current browser adapter, a single device and a unique swap chain throughout a webpage. I have created a Context class in dgel for that purpose.

Commands and Passes

This context needs to be fed commands to actually do anything. With a simple callback logic via a render method attached to the Context instance, we define a command encoder on the device which will allow us to send commands to the GPU:

this.commandEncoder = State.device.createCommandEncoder();
// Submit commands
cb();
State.device.defaultQueue.submit([this.commandEncoder.finish()]);
this.commandEncoder = null;
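Outside of dgel, the same callback logic can be sketched as a tiny class. This is only an illustration under assumptions: MiniContext is a hypothetical name, the device and queue are injected rather than read from shared state, and the encoder is passed to the callback for convenience.

```javascript
// Minimal sketch of the callback-based render pattern.
// `device` and `queue` stand in for a GPUDevice and its GPUQueue.
class MiniContext {
  constructor(device, queue) {
    this.device = device;
    this.queue = queue;
    this.commandEncoder = null;
  }

  render(cb) {
    // The encoder only exists for the duration of the callback.
    this.commandEncoder = this.device.createCommandEncoder();
    try {
      // Commands are recorded on the encoder inside the callback.
      cb(this.commandEncoder);
      // Everything recorded is submitted to the GPU in one go.
      this.queue.submit([this.commandEncoder.finish()]);
    } finally {
      this.commandEncoder = null;
    }
  }
}
```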

A command will usually look like a data structure with a Pipeline and/or Pass attached to it.

Clear pass

Before drawing anything, one might want to clear the back buffer with a color. Here’s an example of how to do that. First, we need to describe how the command works. We create a command with a Pass object of type “render” (the only other type at the moment is “compute”). This command first sets a color attachment and then a depth attachment (think gl.clearColor and gl.clearDepth).

const clearCommand = new Command({
	pass: new Pass(
		"render",
		[new Attachment({ r: 0.07, g: 0.07, b: 0.07, a: 1 })],
		new Attachment(1)
	),
});

Note: using a dictionary object with rgba feels a bit strange as opposed to having a simple array but at least it is pretty explicit.

Great, we now have a plain grey background. Let’s try to draw a basic geometry to start with. Brace yourselves.

Draw pass

A typical geometry needs indices and vertices. As mentioned before, it is just data so we’ll need to create Buffer objects and add data to them in the form of TypedArray (here Float32Array for the positions/normals/uvs and Uint32Array for the indices):

import Geometries from "primitive-geometry";

// ...

const geometry = Geometries.cube();
const geometryVertexBuffer = new Buffer();
const geometryIndicesBuffer = new Buffer();

geometryVertexBuffer.vertexBuffer(
	new Float32Array(
		geometry.positions
			.map((_, index) => [
				geometry.positions[index],
				geometry.normals[index],
				geometry.uvs[index],
			])
			.flat()
			.flat()
	)
);
const indices = new Uint32Array(geometry.cells.flat());
geometryIndicesBuffer.indexBuffer(indices);

Note: the above is a bit messy as primitive-geometry creates geometries with 2d arrays that we need to flatten.
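For readers who find the double flat() hard to follow, here is an equivalent sketch that writes each vertex into a preallocated Float32Array. It assumes the same nested-array shape as above (one [x, y, z] or [u, v] array per vertex); the interleave helper is hypothetical and not part of primitive-geometry or dgel.

```javascript
// Interleave per-vertex attributes into a single Float32Array:
// [px, py, pz, nx, ny, nz, u, v, px, py, pz, ...]
function interleave(positions, normals, uvs) {
  const floatsPerVertex = 3 + 3 + 2;
  const data = new Float32Array(positions.length * floatsPerVertex);
  positions.forEach((position, i) => {
    const offset = i * floatsPerVertex;
    data.set(position, offset);
    data.set(normals[i], offset + 3);
    data.set(uvs[i], offset + 6);
  });
  return data;
}

// Two hand-written vertices in the same nested-array shape.
const vertices = interleave(
  [[0, 0, 0], [1, 0, 0]],
  [[0, 0, 1], [0, 0, 1]],
  [[0, 0], [1, 0]]
);
// vertices.length is 16 (2 vertices * 8 floats)
```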

Additionally, we need to pass a vertex count for our geometry and/or a number of instances to draw.

Our second command to draw a geometry will look like this:

const drawGeometryCommand = new Command({
	pipeline,
	bindGroups: [systemUniformBindGroup, meshUniformBindGroup],
	vertexBuffers: [geometryVertexBuffer],
	indexBuffer: geometryIndicesBuffer,
	count: indices.length,
});

This will surely look familiar to the regl and pex-context folks. But let’s look at the two properties I just sneaked into the command descriptor:

Bind Groups and their layouts

WebGPU is very descriptive. One might argue it makes things too verbose but in the end, when we are clearly aware of what we are sending to the GPU, we tend to be more resource conscious; our applications are snappier and more accessible, our users happier. It feels refreshing compared to calling gl.drawArrays or gl.drawElements over and over in WebGL.

Layouts

Bind groups need to be described with a layout (another reusable component). For instance, we want to share camera matrices among all our geometries and programs. We first need to describe what our data structure will look like and in which shader stage it will be accessed (vertex, fragment and/or compute). Adding uniforms with a type helps us abstract a lot of the computation by implying the byte sizes.

const systemBindGroupLayout = new BindGroupLayout([
	{
		type: "uniform-buffer",
		visibility: GPUShaderStage.VERTEX,
		name: "System",
		uniforms: [
			new Uniform("projectionMatrix", "mat4"),
			new Uniform("viewMatrix", "mat4"),
		],
	},
]);

Note: there’s more work to be done here as the types are based on GLSL and not on WGSL yet and buffer alignment is hard.

We have just described an interface for our Bind Group as containing a single entry: a uniform buffer available in the vertex shader.
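To make "implying the byte sizes" more concrete, here is a hedged sketch of how offsets and a total size could be derived from GLSL-style types using std140-like rules. It is not dgel's actual implementation: the STD140 table only covers a few types, computeBlockLayout is a hypothetical helper, and padding the total to 16 bytes is a conservative assumption.

```javascript
// Size and alignment in bytes for a few GLSL types under std140 rules.
const STD140 = {
  float: { size: 4, align: 4 },
  vec2: { size: 8, align: 8 },
  vec3: { size: 12, align: 16 },
  vec4: { size: 16, align: 16 },
  mat4: { size: 64, align: 16 },
};

const roundUp = (value, multiple) => Math.ceil(value / multiple) * multiple;

// Returns each uniform's byte offset and the total block size.
function computeBlockLayout(uniforms) {
  let offset = 0;
  const offsets = {};
  for (const { name, type } of uniforms) {
    const { size, align } = STD140[type];
    // Each member starts at the next multiple of its alignment.
    offset = roundUp(offset, align);
    offsets[name] = offset;
    offset += size;
  }
  // Pad the total to a multiple of 16 bytes (a conservative choice).
  return { offsets, size: roundUp(offset, 16) };
}

const system = computeBlockLayout([
  { name: "projectionMatrix", type: "mat4" },
  { name: "viewMatrix", type: "mat4" },
]);
// system.size is 128 and system.offsets.viewMatrix is 64
```

The vec3 entry shows why alignment is hard: a vec3 only holds 12 bytes of data but must start on a 16-byte boundary, so a float followed by a vec3 leaves 12 bytes of padding between them.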

Resources: buffers

We can now create a uniform Buffer where we will store and update the camera matrices. The size of this Buffer can be inferred directly from the layout:

const systemUniformsBuffer = new Buffer();
systemUniformsBuffer.uniformBuffer(systemBindGroupLayout.getBindGroupSize());

And that’s about all we need to create a Bind Group:

A GPUBindGroup defines a set of resources to be bound together in a group and how the resources are used in shader stages.

const systemUniformBindGroup = new BindGroup({
	layout: systemBindGroupLayout.gpuBindGroupLayout,
	resources: [
		{
			buffer: systemUniformsBuffer.gpuBuffer,
			offset: 0,
			size: systemBindGroupLayout.getBindGroupSize(),
		},
	],
});

This Bind Group can now be sent to different Commands. In this case, it is a great candidate for system wide uniforms, close to what a Vertex Array Object (VAO) does in WebGL.

Resources: sampler, textures

We also use bind groups to store per-object data, for instance the matrices of a mesh next to a sampler and a texture:

// Create the layout first
const meshBindGroupLayout = new BindGroupLayout([
	{
		type: "uniform-buffer",
		visibility: GPUShaderStage.VERTEX,
		name: "Mesh",
		uniforms: [new Uniform("modelMatrix", "mat4")],
	},
	{
		type: "sampler",
		visibility: GPUShaderStage.FRAGMENT,
		name: "uSampler",
	},
	{
		type: "sampled-texture",
		visibility: GPUShaderStage.FRAGMENT,
		name: "uTexture",
		dimension: "2d",
	},
]);

// Create a uniform Buffer for mesh specific data
const meshUniformsBuffer = new Buffer();
meshUniformsBuffer.uniformBuffer(meshBindGroupLayout.getBindGroupSize());

// Create the sampler and texture from an image element
const uvSampler = new Sampler();

const uvImage = document.createElement("img");
uvImage.src = "assets/uv.jpg";
await uvImage.decode();

const uvTexture = new Texture(null, uvImage);

// Create the bind group with the above resources
const meshUniformBindGroup = new BindGroup({
	layout: meshBindGroupLayout.gpuBindGroupLayout,
	resources: [
		{
			buffer: meshUniformsBuffer.gpuBuffer,
			offset: 0,
			size: meshBindGroupLayout.getBindGroupSize(),
		},
		uvSampler.gpuSampler,
		uvTexture.gpuTexture.createView(),
	],
});
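The order of the resources array presumably has to line up with the order of the layout entries. As a sketch only, here is how such a list could be turned into the { binding, resource } entries that a bind group descriptor is made of, assuming bindings are numbered by position. That numbering is an assumption about dgel's internals, and toBindGroupEntries is a hypothetical helper.

```javascript
// Hypothetical helper: turn an ordered list of resources into bind group
// entries, using each resource's position as its binding index.
function toBindGroupEntries(resources) {
  return resources.map((resource, binding) => ({ binding, resource }));
}

// Strings stand in for the real GPU objects in this illustration.
const entries = toBindGroupEntries([
  { buffer: "meshUniformsBuffer", offset: 0, size: 64 },
  "uvSampler",
  "uvTextureView",
]);
// entries[1] is { binding: 1, resource: "uvSampler" }
```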

As a reminder, we’re trying to build a command to send to the GPU in order to draw a mesh. We now have most of the data needed to draw this mesh:

const drawGeometryCommand = new Command({
	pipeline,
	bindGroups: [systemUniformBindGroup, meshUniformBindGroup],
	vertexBuffers: [geometryVertexBuffer],
	indexBuffer: geometryIndicesBuffer,
	count: indices.length,
});

Now how do we actually render something on screen and where are the shaders? Enter the Pipeline.

Pipeline and Program

A Pipeline for us will merely be a higher structure above a good old WebGL Program. Similarly to BindGroups, it is defined by a layout. On top of that, we’ll define shader code — in GLSL for now — and pass our geometry’s attributes.

Layout, uniforms and attributes

As seen above when generating the data, our geometry is defined by its positions, normals and uvs (in this order in the buffer). So let’s add an array defining our input variables with their types:

const pipeline = new Pipeline({
	bindGroupLayouts: [systemBindGroupLayout, meshBindGroupLayout],
	ins: [
		new Attribute("position", "vec3"),
		new Attribute("normal", "vec3"),
		new Attribute("uv", "vec2"),
	],
	// ...
});
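Because the attributes are interleaved in a single buffer, their order and types determine where each one starts within a vertex. Here is a minimal sketch of that bookkeeping, assuming every component is a 32-bit float; computeVertexLayout is a hypothetical helper, not dgel's code.

```javascript
// Number of 32-bit float components per GLSL type.
const COMPONENTS = { float: 1, vec2: 2, vec3: 3, vec4: 4 };
const BYTES_PER_FLOAT = Float32Array.BYTES_PER_ELEMENT; // 4

// Derive per-attribute byte offsets and the stride of one vertex.
function computeVertexLayout(attributes) {
  let offset = 0;
  const layout = attributes.map(({ name, type }, shaderLocation) => {
    const entry = { name, shaderLocation, offset };
    offset += COMPONENTS[type] * BYTES_PER_FLOAT;
    return entry;
  });
  return { attributes: layout, arrayStride: offset };
}

const vertexLayout = computeVertexLayout([
  { name: "position", type: "vec3" },
  { name: "normal", type: "vec3" },
  { name: "uv", type: "vec2" },
]);
// vertexLayout.arrayStride is 32; the offsets are 0, 12 and 24
```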

Our pipeline needs to be aware of bindGroupLayouts for two reasons:

Note: There’s surely more abstraction to be done here, but to handle a broad range of cases, that’s the bare minimum.

Great, our vertex shader is now able to read these attributes. But we know that we’ll want these available in the fragment shader stage so let’s add some outs attributes:

const pipeline = new Pipeline({
	bindGroupLayouts: [systemBindGroupLayout, meshBindGroupLayout],
	ins: [
		new Attribute("position", "vec3"),
		new Attribute("normal", "vec3"),
		new Attribute("uv", "vec2"),
	],
	outs: [new Attribute("vNormal", "vec3"), new Attribute("vUv", "vec2")],
	// ...
});

Program and Shaders

To simplify things, our Pipeline object can receive a vertex and fragment property that will both have access to the ins and outs previously defined.

Since we’ve defined the systemBindGroupLayout and meshBindGroupLayout, we’ll also receive the uniform blocks and resources they declare: System, Mesh, uSampler and uTexture.

We access these uniform blocks via their name with the first letter lowercased, as a convention.
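To picture that naming convention, here is a sketch of a generator that emits a GLSL uniform block from a layout entry. The exact output dgel produces is not shown in this article, so the uniformBlockGLSL function, the set and binding numbers and the formatting are all assumptions.

```javascript
// Hypothetical generator: emits a GLSL uniform block for one layout entry.
function uniformBlockGLSL({ name, uniforms }, set, binding) {
  // Instance name: block name with its first letter lowercased.
  const instance = name.charAt(0).toLowerCase() + name.slice(1);
  const members = uniforms
    .map((uniform) => `  ${uniform.type} ${uniform.name};`)
    .join("\n");
  return `layout(set = ${set}, binding = ${binding}) uniform ${name} {\n${members}\n} ${instance};`;
}

const glsl = uniformBlockGLSL(
  {
    name: "System",
    uniforms: [
      { name: "projectionMatrix", type: "mat4" },
      { name: "viewMatrix", type: "mat4" },
    ],
  },
  0,
  0
);
// The block can then be referenced as system.projectionMatrix in the shader.
```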

With this in mind, it is easy to just use all of the attributes in our vertex and fragment shader:

const pipeline = new Pipeline({
	bindGroupLayouts: [systemBindGroupLayout, meshBindGroupLayout],
	ins: [
		new Attribute("position", "vec3"),
		new Attribute("normal", "vec3"),
		new Attribute("uv", "vec2"),
	],
	outs: [new Attribute("vNormal", "vec3"), new Attribute("vUv", "vec2")],
	vertex: /* glsl */ `
    void main() {
      vNormal = normal;
      vUv = uv;

      gl_Position =
        system.projectionMatrix *
        system.viewMatrix *
        mesh.modelMatrix *
        vec4(position, 1.0);
    }`,
	fragment: /* glsl */ `
    void main() {
      // outColor = vec4(vNormal * 0.5 + 0.5, 1.0);
      outColor = texture(sampler2D(uTexture, uSampler), vUv);
    }`,
});

The vertex shader simply passes our vertex normal and uv along to the fragment shader, and defines our gl_Position output from the model, view and projection matrices and the mesh’s current vertex position.

The fragment shader has one commented-out line that just displays the normals (useful for debugging). The other line uses our sampler and texture resources as well as the uv to texture our geometry.

dgel: cube example showing normals (left) and texture (right)

Pipeline states

We have seen how to abstract the shader part of the pipeline, but similarly to WebGL we can set the blending mode, the primitive topology (triangle list, lines…) and the depth/stencil state.

Structurally, the pipeline consists of a sequence of programmable stages (shaders) and fixed-function states, such as the blending modes.
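As an example of what a fixed-function blending state computes, here is classic alpha blending written out on the CPU. This is only an illustration of the arithmetic, with the same factors applied to all four channels for simplicity; on the GPU this happens in hardware and blendAlpha is a hypothetical name.

```javascript
// Classic alpha blending for one RGBA color, with channels in [0, 1]:
// result = src * srcAlpha + dst * (1 - srcAlpha)
function blendAlpha(src, dst) {
  const srcAlpha = src[3];
  return src.map((channel, i) => channel * srcAlpha + dst[i] * (1 - srcAlpha));
}

// Half-transparent red drawn over opaque blue.
const blended = blendAlpha([1, 0, 0, 0.5], [0, 0, 1, 1]);
// blended is [0.5, 0, 0.5, 0.75]
```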

Frame loop

So what’s left to do? Rotating the cube via its matrix, setting the uniforms and submitting our commands:

requestAnimationFrame(function frame() {
	// Update our clock object
	clock.getDelta();

	// Set rotation according to current time
	mat4.identity(modelMatrix);
	mat4.rotateY(modelMatrix, modelMatrix, clock.time);

	// Update UBOs
	systemUniformsBuffer.setSubData(0, camera.projectionMatrix);
	systemUniformsBuffer.setSubData(4 * 16, camera.viewMatrix);
	meshUniformsBuffer.setSubData(0, modelMatrix);

	// Call render:
	// - clear first
	// - draw geometry as sub command
	context.render(() => {
		context.submit(clearCommand, () => {
			context.submit(drawGeometryCommand);
		});
	});
	requestAnimationFrame(frame);
});
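The 4 * 16 offset above is a byte offset: a mat4 holds 16 floats of 4 bytes each, so the view matrix starts 64 bytes into the buffer. A small CPU-side sketch of the same packing, where packSystemUniforms is a hypothetical helper and the identity matrix is just sample data:

```javascript
const FLOATS_PER_MAT4 = 16;
const MAT4_BYTES = FLOATS_PER_MAT4 * Float32Array.BYTES_PER_ELEMENT; // 64

// Pack both camera matrices into one array mirroring the uniform buffer.
function packSystemUniforms(projectionMatrix, viewMatrix) {
  const data = new Float32Array(2 * FLOATS_PER_MAT4);
  // Float32Array.set takes an element offset, not a byte offset.
  data.set(projectionMatrix, 0);
  data.set(viewMatrix, FLOATS_PER_MAT4); // element 16, i.e. byte 64
  return data;
}

const identity = new Float32Array([
  1, 0, 0, 0,
  0, 1, 0, 0,
  0, 0, 1, 0,
  0, 0, 0, 1,
]);
const packed = packSystemUniforms(identity, identity);
// packed.byteLength is 128
```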

Conclusion

Phew, that was a long road to follow in order to draw a cube but hopefully you see the potential here. dgel is just an initial exploration in defining an engine core abstraction with WebGPU.

The API, although not final, is already pleasant to work with and closer to modern APIs like Vulkan, DirectX and Metal.

Another layer of abstraction is definitely needed and should hide the GPU objects, meaning we could spend less time defining layouts and more time doing actual graphics.

The source code is available here: https://github.com/dmnsgn/dgel. If time permits, I might write more about abstracting shaders and using compute shaders. Until then, thanks for reading.

Links: