Optimizing in BQN
In the previous post I describe how I made a collision checking function between multiple line segments and circles. Here is the code:
Dot β +Β΄βΓ
# π©β’nβΏ2βΏ2 π¨β’β¨kβΏ2, kβ©
ColSegmentsCircles β {
[a,b] β 1βπ© β cβΏr β π¨ β r_squared β rβ2
ca β {π© -β1 c}β1 a β cb β {π© -β1 c}β1 b
col β {β¨Β΄β1}β2 r_squaredβ₯β1 DotΛβ1 {π© -β1 c}β1 π©
n β (-βΎβ)ββ½Λ b-a
d_squared β (2βΛ n {π¨ Dotβ1 π©}Λ ca ) Γ· DotΛΛ b-a
col β© col β¨Λ β¨Β΄Λ (0 β₯ ca DotΛΛ cb) β§ {r_squared β₯Β¨ π©}Λ d_squared
}
This code will be used as the base for the path planners and will be the computation bottleneck. As a first draft I am pretty happy with the result but it is not the best I could do. Especially, there are many places in my code where I have to reorder the data axis to perform computations such as this line
col β {β¨Β΄β1}β2 r_squaredβ₯β1 DotΛβ1 {π© -β1 c}β1 π©
I asked for advices on discord and Marshall helped me rewrite the code. His big insight was to change the data representation from π©β’nβΏ2βΏ2 π¨β’β¨kβΏ2, kβ©
to π©β’2βΏnβΏ2 π¨β’β¨2βΏk, kβ©
. The first axis becomes the $(x,y)$ information. Here is the code with the new representation.
# π©β’2βΏnβΏ2 π¨β’β¨2βΏk, kβ©
ColSegmentsCircles2 β {
cβΏr β π¨ β cacb β π© -βΛ c β ca β ββ2 cacb β ab β -Λβ1 π© β r_sq β ΓΛr
n β (-βΎβ)ββ½ ab
d_sq β (ΓΛ +Λ n Γ ca) Γ· +ΛΓΛ ab
col β (β¨Β΄ββ₯Λ (r_sqβΈβ₯)β1 +ΛΓΛΛ cacb) β¨ β¨ΛΛ(0 β₯ +Λ ca Γ 1ββ2 cacb) β§ r_sqβΈβ₯β1 d_sq
}
We can immediatly see that there are much less use of the rank operator β
. With the new representation the code is able to use constantly the SIMD instructions.
I benchmarked the two functions and wow! The new code is around 3~10 times faster depending on how much line segments and circles I input. The more we give the better it performs compared to the original function.
However, this code is still suboptimal. We first test if the segment endpoints are inside the circles and then we test the distance of the circles to the segments. If the first test is positive then we should not perform the second test! Dzaima showed me how to use the under operator βΎ
to only apply the second test on my boolean vector. Here is my final code.
# π©β’2βΏnβΏ2 π¨β’β¨2βΏk, kβ©
ColSegmentsCircles3 β {
cβΏr β π¨ β r_sq β ΓΛr
F β {β¨Β΄ββ₯Λ (r_sqβΈβ₯)β1 +ΛΓΛ π© -βΛ c}
G β { cacb β π© -βΛ c β ca β ββ2 cacb β ab β -Λβ1 π©
n β (-βΎβ)ββ½ ab
d_sq β (ΓΛ +Λ n Γ ca) Γ· +ΛΓΛ ab
β¨ΛΛ(0 β₯ +Λ ca Γ 1ββ2 cacb) β§ r_sqβΈβ₯β1 d_sq}
m β Β¬F π© β col β (G m /β2 π©)βΎ(mβΈ/) (1ββ’π©) β₯ 1
}
This code performs around 2~3 times faster than the previous one and around 4~25 times faster than my original code!
Optimizing in BQN seems to be first and foremost thinking about the data. Instead of randomly ordering the information it is better to think about which operation will be applied to select which shape we should use.