jes notes

2025-03-17

Last modified: 2025-03-17 12:52:38

Shader compile time
Vision for Isoform
Bauble
Shader compilation

Shader compile time

I think one last thing to do before the interval arithmetic compiler thing is to work out why my shaders take so long to compile.

It's taking 1850 ms to compile a shader for a document that is just a single sketch of a square.

If I delete the "branchless in/out test" it goes down to 784 ms, so that obviously has to be the focus of the improvement.

One thing that makes it really hard is that there seems to be some caching of compiled results, so if the compiler works out that it has already compiled an equivalent shader before, it compiles really quickly even though no improvement has been made.
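A minimal sketch of how a compile-time measurement could be made less sensitive to that caching, assuming the cache only hits when the shader is effectively identical. The names `GlLike`, `withNonce`, `timeBuildMs` and the `CACHE_NONCE` placeholder are all hypothetical, not part of Isoform or WebGL. The idea is to splice a tiny per-run float into live shader code (much like nudging a vertex by hand), then time everything up to the `LINK_STATUS` query, since browsers may defer the real compile and link work until a status is requested.

```typescript
// Minimal slice of the WebGL API used below, declared locally so the sketch
// type-checks without a browser. It is meant to be structurally compatible
// with a real WebGL2RenderingContext.
interface GlLike {
  readonly VERTEX_SHADER: number;
  readonly FRAGMENT_SHADER: number;
  readonly LINK_STATUS: number;
  createShader(type: number): unknown;
  shaderSource(shader: unknown, source: string): void;
  compileShader(shader: unknown): void;
  createProgram(): unknown;
  attachShader(program: unknown, shader: unknown): void;
  linkProgram(program: unknown): void;
  getProgramParameter(program: unknown, pname: number): unknown;
}

// Hypothetical placeholder token. The shader has to use it in live code (for
// example by adding it to a vertex coordinate) so it cannot be optimised away.
const NONCE_TOKEN = "CACHE_NONCE";

// Replace every placeholder with a tiny float literal that differs per run.
function withNonce(source: string, run: number): string {
  const literal = (Math.trunc(Math.abs(run)) * 1e-6).toFixed(6);
  return source.split(NONCE_TOKEN).join(literal);
}

// Time compile + link. `now` is injected so any clock can be used.
function timeBuildMs(
  gl: GlLike,
  vertexSrc: string,
  fragmentSrc: string,
  now: () => number,
): number {
  const began = now();
  const program = gl.createProgram();
  const add = (type: number, src: string): void => {
    const shader = gl.createShader(type);
    gl.shaderSource(shader, src);
    gl.compileShader(shader);
    gl.attachShader(program, shader);
  };
  add(gl.VERTEX_SHADER, vertexSrc);
  add(gl.FRAGMENT_SHADER, fragmentSrc);
  gl.linkProgram(program);
  // Querying the link status forces any deferred work to finish first.
  const linked = Boolean(gl.getProgramParameter(program, gl.LINK_STATUS));
  const elapsed = now() - began;
  if (!linked) throw new Error("program failed to link");
  return elapsed;
}
```

In a page this might be called as `timeBuildMs(gl, vs, withNonce(fs, run), () => performance.now())` with `run` incremented each time. Whether a given browser or driver cache can still see through such a change is not something this sketch can guarantee.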

OK, it looks like some cooler stuff is available in GLSL ES 3.00, which I thought wasn't available because I kept getting warnings when I tried to use it, but you just have to put #version 300 es as the very first line of the shader to enable it!

So now we get:

first-class arrays
integer modulo

And it now compiles my square in 1450 ms, so we've saved 400 ms for free! 620 ms with "branchless in/out test" removed, so still 830 ms spent on compiling that.

The test is:

  // Branchless in/out test
  s *= 1.0 - 2.0 * float((p2d.y >= va.y && p2d.y < vb.y && ds.y > 0.0) ||
                    (p2d.y < va.y && p2d.y >= vb.y && ds.y <= 0.0));

It occurs once per line segment on the polygon, so 4 times for a square.
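For reference, here is a CPU-side port of that expression, as a sketch only. `Vec2`, `crossesEdge`, `flipFactor` and `applyInOutTest` are hypothetical names. The notes do not say how `ds` is computed, so it is treated as an opaque input here. The port makes the two halves explicit: a boolean crossing condition, and the `1.0 - 2.0 * float(b)` trick that turns it into a multiplier of +1 or -1.

```typescript
type Vec2 = { x: number; y: number };

// Direct port of the boolean condition from the shader. What `ds` holds is
// not stated in the notes, so it is passed in as-is.
function crossesEdge(p2d: Vec2, va: Vec2, vb: Vec2, ds: Vec2): boolean {
  return (
    (p2d.y >= va.y && p2d.y < vb.y && ds.y > 0.0) ||
    (p2d.y < va.y && p2d.y >= vb.y && ds.y <= 0.0)
  );
}

// 1.0 - 2.0 * float(b): +1 when b is false, -1 when b is true.
function flipFactor(b: boolean): number {
  return 1.0 - 2.0 * (b ? 1.0 : 0.0);
}

// One application per polygon edge; s flips sign each time the test passes.
function applyInOutTest(s: number, p2d: Vec2, va: Vec2, vb: Vec2, ds: Vec2): number {
  return s * flipFactor(crossesEdge(p2d, va, vb, ds));
}
```

Running this over every edge of a polygon flips `s` once per edge that passes the test, which is the four-times-per-square repetition described above.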

My best guess is that the boolean logic causes lots of possible branches and it has to somehow compile every combination? Even though they converge again shortly after?

I just tried again but this time moved one of the vertices slightly, and it took 1920 ms to compile, so maybe 1450 ms included some annoying caching, and switching to GLSL ES 3.00 has not helped at all.

Vision for Isoform

CAD kernel

SDF is represented with an arithmetic tree which we turn into GLSL via static single assignment and register allocation, like in Fidget
Evaluated with interval arithmetic, like in Fidget
Rendered with ray marching with binary search for the first collision, like in https://www.shadertoy.com/view/wfjXRG
"tape shortening" like in Fidget, but implemented with flags instead of re-doing register allocation for every evaluation
node properties passed in as uniforms so that they can be edited without recompiling the shader
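To make the first and last points concrete, here is a rough sketch of turning an arithmetic tree into straight-line GLSL with one assignment per node, with node properties emitted as uniforms. Everything in it is an assumption for illustration: the `Expr` shape, the `emitSsa` name, the `p` variable for the sample point, and the `tN` naming. It only does the single-assignment step; it does not do register allocation, interval evaluation or tape shortening, and it is not how Fidget or Isoform actually implement this.

```typescript
type BinOp = "add" | "sub" | "mul" | "min" | "max";
type Expr =
  | { kind: "coord"; axis: "x" | "y" | "z" } // component of the sample point p
  | { kind: "param"; name: string } // editable node property, becomes a uniform
  | { kind: "const"; value: number }
  | { kind: "unary"; op: "abs" | "sqrt" | "neg"; arg: Expr }
  | { kind: "binary"; op: BinOp; a: Expr; b: Expr };

const BINARY: Record<BinOp, (a: string, b: string) => string> = {
  add: (a, b) => `${a} + ${b}`,
  sub: (a, b) => `${a} - ${b}`,
  mul: (a, b) => `${a} * ${b}`,
  min: (a, b) => `min(${a}, ${b})`,
  max: (a, b) => `max(${a}, ${b})`,
};

// GLSL float literals need a decimal point or exponent, so 2 becomes "2.0".
function literal(v: number): string {
  return Number.isInteger(v) ? v.toFixed(1) : String(v);
}

function emitSsa(root: Expr): { uniforms: string[]; body: string[]; result: string } {
  const names = new Map<Expr, string>(); // node identity -> assigned name
  const params = new Set<string>();
  const body: string[] = [];

  function rhsOf(e: Expr): string {
    if (e.kind === "coord") return `p.${e.axis}`;
    if (e.kind === "param") {
      params.add(e.name);
      return e.name;
    }
    if (e.kind === "const") return literal(e.value);
    if (e.kind === "unary") {
      const arg = visit(e.arg);
      return e.op === "neg" ? `-${arg}` : `${e.op}(${arg})`;
    }
    return BINARY[e.op](visit(e.a), visit(e.b));
  }

  // Each distinct node object is emitted once; shared subtrees are reused.
  function visit(e: Expr): string {
    const known = names.get(e);
    if (known !== undefined) return known;
    const rhs = rhsOf(e);
    const id = `t${body.length}`;
    body.push(`float ${id} = ${rhs};`);
    names.set(e, id);
    return id;
  }

  const result = visit(root);
  const uniforms = Array.from(params).map((n) => `uniform float ${n};`);
  return { uniforms, body, result };
}

// Example: a circle, sqrt(x*x + y*y) - radius, with radius as a uniform.
const px: Expr = { kind: "coord", axis: "x" };
const py: Expr = { kind: "coord", axis: "y" };
const circle: Expr = {
  kind: "binary",
  op: "sub",
  a: {
    kind: "unary",
    op: "sqrt",
    arg: {
      kind: "binary",
      op: "add",
      a: { kind: "binary", op: "mul", a: px, b: px },
      b: { kind: "binary", op: "mul", a: py, b: py },
    },
  },
  b: { kind: "param", name: "radius" },
};
const emitted = emitSsa(circle);
// emitted.body ends with "float t7 = t5 - t6;" and emitted.result is "t7".
```

Because `radius` only appears as a uniform, changing it would mean setting a uniform rather than regenerating and recompiling the source, which is the point of the last item in the list.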

User interface

Basically just copy FreeCAD. I know people criticise FreeCAD for its user interface but it is mostly fine and I already know it so it suits me.

I want a draggable transform handle. Maybe some sort of scroll wheel input or something for changing things like blend radius, box size, etc. directly in the 3d view.

The main thing is a good 2d sketch editor.

Research areas

how can we make shader compilation more efficient?
how can we target fillets/chamfers to specific edges?
domain deformation of the locations of a Pattern?
other ways of creating and combining field functions?

Bauble

Kevin Lynagh linked me to https://ianthehenry.com/posts/bauble/building-bauble/ which says he is somehow compiling shaders in web workers?

I don't know if he means that "his" compiler is creating GLSL in a web worker, or if the actual shader compilation is in a web worker. If the latter, I didn't know that was possible.

Shader compilation

ChatGPT says you can't compile shaders in a web worker because you need a WebGL context to compile shaders and web workers don't have a way to get access to your WebGL context.
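Two things worth checking against that claim. Browsers that support OffscreenCanvas do let a worker create its own WebGL context, although anything compiled there belongs to that context and not to the page's. Separately, the KHR_parallel_shader_compile extension lets the page ask whether a link has finished without blocking on it. Below is a small sketch of the second route; `GlPollLike`, `LinkState` and `pollLink` are hypothetical names, and the interface is a cut-down stand-in for the real context so that the logic can be exercised without a browser.

```typescript
// Only the two context methods this sketch needs.
interface GlPollLike {
  getExtension(name: string): unknown;
  getProgramParameter(program: unknown, pname: number): unknown;
}

type LinkState = "done" | "pending" | "unsupported";

// Non-blocking check: asks the extension whether the program has finished
// compiling and linking, instead of querying LINK_STATUS (which would wait).
function pollLink(gl: GlPollLike, program: unknown): LinkState {
  const ext = gl.getExtension("KHR_parallel_shader_compile") as
    | { COMPLETION_STATUS_KHR: number }
    | null;
  if (!ext) return "unsupported";
  const finished = gl.getProgramParameter(program, ext.COMPLETION_STATUS_KHR);
  return finished ? "done" : "pending";
}
```

The intended use would be to call `linkProgram`, keep rendering with the old program, call `pollLink` once per frame, and only swap programs once it reports "done". When it reports "unsupported" the only option left is the ordinary blocking status query.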

I'm thinking that I must be doing something very wrong in order for a shader for a simple square shape to take 2 seconds to compile.

It still takes 1.8 seconds to compile if I paste the generated source into shadertoy.

What if I delete all the unused steps/field/opacity/secondary code paths?

No difference, still 1.8 seconds. Deleting unused functions seems to not change the compile time at all; I think the compiler is quite good at dead code removal.

I've deleted a bunch of other stuff and now it is compiling in only 0.2 seconds. Did I pass some critical size where it can do it more efficiently? Deleting the axis indicator seemed to help. Not sure, it might just be very effective caching?

https://www.shadertoy.com/view/tfBXRc

This compiles super fast ("0.0 secs"), even if I make changes to it that I am pretty sure aren't cached. It includes the complicated polygon function and not much else.

So that proves that the polygon function can be compiled quickly.

So maybe the shader has some critical size at which the compilation suddenly gets much more complicated? But why?

Is there a way to get WebGL to tell me "how large" my compiled shader is?

Apparently the WEBGL_get_program_binary extension is not supported on my computer.

OK, what if I add back in the raymarching code? Now it takes 1.3 seconds to compile. And I can make it take 1.3 seconds again by simply changing the camera location (first argument to rayMarch) by some trivial amount.

If I delete the MarchResult struct and have rayMarch() return a vec3 then it only takes 0.5 seconds to compile.

But goes up to 1.3 seconds if I add the early exit back to the rayMarch() function!

OK, so back to Isoform: if I delete the early exit from rayMarch() does it compile faster?

No. Still takes 1.9 seconds.

I guess this is consistent with the idea that it's not any particular piece of code that is making it slow, but rather the overall amount of it. Deleting anything is enough to make it fast.

In shadertoy what if I delete the field visualisation code and only keep the raymarching code?

Yeah, fast again. Even with early exit from rayMarch().

So everything points towards compilation time getting drastically slower once you go over a critical size, where "size" is probably the compiled binary size of the shader program, but that is not exposed to us.

ChatGPT suggested the compiler may be struggling to allocate registers.

So perhaps the issue with the sketch function is that it does lots of straight-line code in the same scope. Can I move the in/out test into a function maybe?

Doesn't seem to help.

People online keep saying that shader compilation is slow because of loop unrolling, but I think that isn't it. If I reduce MAX_STEPS from 500 to 5, it doesn't appear to change the compilation time.

Should I be rendering the secondary object as a second instance of the shader, drawing on top of the main texture output, instead of inside the main shader? And then I only need to recompile the one object instead of the whole shader!

So that would be a win. It would also reduce the size of the main shader, so that one would (probably?) compile quicker as well.

So we'd make the basic shader code just draw one object, and optionally take a base layer as input, and parameters are:

input texture
colour of shape
whether to draw a background colour
whether to draw the axis indicator

For the main object there is no input texture, the shape is grey and we draw the background. For the secondary object the input texture is the output of the first shader, the shape is red, and we don't draw the background (i.e. leave it transparent).
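As a sketch, the per-pass parameters could be written down like this. The names (`PassParams`, `buildPasses`, the `"mainOutput"` texture label) and the exact colour values are placeholders, and drawing the axis indicator only in the main pass is an assumption, since the notes don't say which pass owns it.

```typescript
type Rgb = readonly [number, number, number];

interface PassParams {
  inputTexture: string | null; // label of the base layer, or null for none
  shapeColour: Rgb;
  drawBackground: boolean;
  drawAxisIndicator: boolean;
}

// Returns the passes in draw order: main object first, then the optional
// secondary object composited over the main pass's output.
function buildPasses(hasSecondary: boolean): PassParams[] {
  const mainPass: PassParams = {
    inputTexture: null,
    shapeColour: [0.5, 0.5, 0.5], // grey (exact value is a placeholder)
    drawBackground: true,
    drawAxisIndicator: true, // assumption: drawn once, in the main pass
  };
  if (!hasSecondary) return [mainPass];
  const secondaryPass: PassParams = {
    inputTexture: "mainOutput", // hypothetical label for the first pass's output
    shapeColour: [1.0, 0.0, 0.0], // red
    drawBackground: false, // leave transparent so the base layer shows through
    drawAxisIndicator: false,
  };
  return [mainPass, secondaryPass];
}
```

With this split, all four parameters could be uniforms of one shared shader, so switching a pass between main and secondary would not by itself need a recompile. Only a change to the shape's own code would.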

Is it maybe better to never draw the background in the shader, and instead supply an input texture full of background colour to the first shader?

Empirically, deleting all the secondary rendering code from the shader does not make it compile any faster. So you might still get a benefit from only having to compile a small shader for the secondary node, if the secondary node is simple, but it doesn't reduce the compilation time of the main shader.
