A coding impromptu

This post is a rolling collection of algorithms and computational ideas I like, implemented in BQN:

Table of Contents

Extrapolating Perlis' remark1, it's likely that a group of 50 individuals would devise 35 to 40 distinct solutions to even the simplest problem in BQN. Therefore, I will frequently juxtapose my implementations with those of seasoned BQNators2.

Z algorithm

This is a very efficient procedure that finds prefix strings in linear time. The imperative implementation reads:

ZI ← {π•Šs:
  lβ€Ώrβ€Ώz ← 0βš‡0 0β€Ώ0β€Ώs
  z ⊣ {
    v ← r(⊒1⊸+β€’_while_{(𝕩+𝕨)<β‰ s ? =Β΄βŸ¨π•©,𝕩+π•¨βŸ©βŠ‘Β¨<s ; 0}<β—Ά({zβŠ‘Λœπ•©-l}⌊-+1)β€Ώ0)𝕩
    r <β—Ά@β€Ώ{π•Š: l↩𝕩-v+1 β‹„ r↩𝕩} 𝕩+v-1
    z v⌾(π•©βŠΈβŠ‘)↩
  }Β¨ ↕≠s
}
ZI "abacabadabacaba"
⟨ 15 0 1 0 3 0 1 0 7 0 1 0 3 0 1 ⟩

Two algorithmic improvements can be made here, namely only iterate over indices where the character found is equal to the first character, and only search to extend the count if it goes up to the end of r:

ZFun ← {π•Šs:
  CountEq ← { 1⊸+β€’_while_((≠𝕨)βŠΈβ‰€β—ΆβŸ¨βŠ‘βŸœπ•¨β‰‘βŠ‘βŸœπ•©,0⟩) 0 }
  l←r←0 β‹„ Ulr ← {(rβŒˆβ†©π•¨+𝕩)>r ? l↩𝕨 β‹„ 𝕩; 𝕩}
  SearchEq ← ⊣ Ulr ⊒ + + CountEqβ—‹(β†“βŸœs) ⊒
  Set ← {iπ•Šπ•©: ((r-i) (i SearchEq 0⌈⊣)βŸβ‰€ (i-l)βŠ‘π•©)⌾(iβŠΈβŠ‘) 𝕩 }
  (⌽1↓/βŠ‘βŠΈ=s) Set´˜ β†‘Λœβ‰ s
}

I came up with two array versions, with quadratic and cubic time complexities respectively:

ZAQ ← Β―1↓↓(+´·∧`⊣=β‰ βŠΈβ†‘)Β¨<
ZAC ← (+´∧`)Β¨<=β†•βˆ˜β‰ {Β«βŸπ•¨π•©}⌜<
(ZAQ≑ZAC)β—Ά@β€ΏZAC "abacabadabacaba"
⟨ 15 0 1 0 3 0 1 0 7 0 1 0 3 0 1 ⟩

With further refinements, the earlier solutions can be transformed into:

ZAQβ€ΏZAC ← {(+´∧`)¨𝕏}Β¨ βŸ¨β‰ β†‘β†“=βŒ½βˆ˜β†‘, <=«⍟(β†•βˆ˜β‰ )⟩

Longest increasing sub-sequence

This problem can be solved in \(O(n\log n)\) using dynamic programming. Here is an imperative implementation which is quadratic, but can be optimized:

LISI ← {
  kβ€Ώdp ← Β―1β€Ώ(βˆžΒ¨π•©)
  {i ← βˆ§Β΄β—Ά(βŠ‘βŠβŸœ0)β€Ώ{π•Š:k+↩1} dp<𝕩 β‹„ dp π•©βŒΎ(iβŠΈβŠ‘)↩}Β¨ 𝕩
  +´∞>dp
}
LISIΒ¨ ⟨0β€Ώ1β€Ώ0β€Ώ3β€Ώ2β€Ώ3, 10β€Ώ9β€Ώ2β€Ώ5β€Ώ3β€Ώ7β€Ώ101β€Ώ18, 7β€Ώ7β€Ώ7β€Ώ7β€Ώ7⟩
⟨ 4 4 1 ⟩

A more elegant array solution is:

LISA ← +Β΄βˆžβ‰ βˆžΒ¨{π•¨βŒΎ((βŠ‘π•©β‹π•¨-1)βŠΈβŠ‘)𝕩}´⌽
LISAΒ¨ ⟨0β€Ώ1β€Ώ0β€Ώ3β€Ώ2β€Ώ3, 10β€Ώ9β€Ώ2β€Ώ5β€Ώ3β€Ώ7β€Ώ101β€Ώ18, 7β€Ώ7β€Ώ7β€Ώ7β€Ώ7⟩
⟨ 4 4 1 ⟩

Let's )explain this optimized version, so we can truly appreciate its beauty:

 +Β΄βˆžβ‰ βˆžΒ¨{π•¨βŒΎ((βŠ‘π•©β‹π•¨-1)βŠΈβŠ‘)𝕩}´⌽ 
 β”‚ β”‚ β”‚ β”‚β”‚    β”‚ β”‚ β”‚  β”‚ β”‚  β”‚ 
 β”‚ β”‚ β”‚ {┼────┼─┼─┼──┼─┼─´│ 
 β”‚ β”‚ ∞¨ β”‚    β”‚ β”‚ β”‚  β”‚ β”‚ β”‚β”‚ 
 β”‚ β”‚  β””β”€β”Όβ”€β”€β”€β”€β”Όβ”€β”Όβ”€β”Όβ”€β”€β”Όβ”€β”Όβ”€β”ΌβŒ½ 
 β”‚ βˆžβ‰ β”€β”€β”€β”Όβ”€β”€β”€β”€β”Όβ”€β”Όβ”€β”Όβ”€β”€β”Όβ”€β”Όβ”€β”˜  
 +Β΄ β”‚   β”‚    β”‚ β”‚ β”‚  β”‚ β”‚    
  β””β”€β”˜   β”‚    β”‚ β”‚ β”‚  β”‚ β”‚    
        β”‚    β”‚ 𝕨-1  β”‚ β”‚    
        β”‚    π•©β‹β”€β”˜   β”‚ β”‚    
        β”‚   βŠ‘β”€β”˜     β”‚ β”‚    
        β”‚   β””β”€β”€β”€β”€β”€β”€βŠΈβŠ‘ β”‚    
        π•¨βŒΎβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚    
         β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€π•©    
β•Άβ”€β”€β”€β”€β”€β”€β”€β”€β”˜

The full expression is structured as a two-train: we sum all finite entries from the result of the rightmost three-train. The three-train is a right fold over the reversed input, with an initial array of ∞ and the same length as the input. In each step of the fold, we modify the right argument using under: we perform a binary search with strict comparison to find where the next element should go. The element is either placed at the end of the unfilled region, or it replaces the first element that is greater than 𝕨. Since BQN uses a based array model, we pick the enclosed atom from this operation's result. So it goes3.

N-queens problem

This problem is the archetypal example of backtracking. Initially, I tried to solve it using a function to place the queens in the full board, hoping that it would lead to a more array oriented solution:

8 {((∨⌜´0⊸=)∨(0=-⌜´)∨0=+⌜´) 𝕩-Β¨<↕𝕨} 2β€Ώ3
β”Œβ”€                 
β•΅ 0 1 0 1 0 1 0 0  
  0 0 1 1 1 0 0 0  
  1 1 1 1 1 1 1 1  
  0 0 1 1 1 0 0 0  
  0 1 0 1 0 1 0 0  
  1 0 0 1 0 0 1 0  
  0 0 0 1 0 0 0 1  
  0 0 0 1 0 0 0 0  
                  β”˜

This resulted in a more complicated algorithm, so I decided to go for the classical Wirth implementation:

NQ ← {π•Šn:
  Vβ€ΏP ← {βŠ£π•(⊒∾-β‹ˆ+)´∘⊒}Β¨ βŸ¨βˆ¨Β΄βŠ‘Β¨Λœ, {1⌾(π•©βŠΈβŠ‘)𝕨}¨⟩
  {n≠𝕩 ? +Β΄(𝕨V⊒)β—ΆβŸ¨(𝕩+1)π•ŠΛœπ•¨P⊒,0⟩∘(π•©β‹ˆβŠ’)Β¨ ↕n ; 1
  }˜´ (0β‹ˆ0Γ—Β·β†•Β¨βŠ’βˆΎΒ·β‹ˆΛœ+˜)n 
}

Which nicely compares with the OEIS sequence:

a000170 ← 1β€Ώ0β€Ώ0β€Ώ2β€Ώ10β€Ώ4β€Ώ40β€Ώ92
a000170 ≑ NQΒ¨ 1+↕8
1

And of course, in the implementation above I could have used a single array instead of three, but I find the resulting validation and position functions very aesthetic the way they are.

Majority element

The Boyer–Moore algorithm allows for finding the majority element (element that appears more than βŒŠπ•©Γ·2 times in the array) in linear time. If such element exists, then it is equal to the mode of the data, and for this task we have a nice array solution. The original implementation could be expressed as:

BM ← {v←0 β‹„ Iβ†βŠ’βŠ£=β—Ά{π•Š:v+↩1}β€Ώ{π•Š:v-↩1} β‹„ 0{π•Š:v=0}β—ΆβŸ¨I,IΛœβŠ£βŸ©Β΄π•©}
BM 6β€Ώ1β€Ώ3β€Ώ1β€Ώ3β€Ώ3β€Ώ4β€Ώ3β€Ώ3β€Ώ5
3

The previous fold tracks the majority element as state, a more elegant approach maintains the number of votes:

BM ← {e←@ β‹„ 0{𝕩=0 ? e↩𝕨⋄1 ; 𝕩+Β―1⋆e≒𝕨}´𝕩 β‹„ e}
BM 6β€Ώ1β€Ώ3β€Ώ1β€Ώ3β€Ώ3β€Ώ4β€Ώ3β€Ώ3β€Ώ5
3

An identity on the naturals

Some time ago, while working on performance optimization of linear algebra operations with Boolean arrays, I encountered an interesting summation property for an array \(a\) of length \(n\):

\begin{equation*} \sum_{i | a_i \neq 0} \sum_{j=i+1} f_j = \sum_{j=0} f_j \sum_{i < j | a_i \neq 0} 1 \end{equation*}

It turns out that the RHS can be elegantly transformed into a scan, giving rise to a beautiful identity that applies to all natural numbers, not just Booleans as I initially thought:

(+`≑·+Β΄/β‰€βŸœ<βŠ’Λœ) β€’rand.Range˜ 1e3
1

This identity holds because βŠ’Λœ represents the indices i of the list, and since +Β΄(/𝕩)=i ←→ iβŠ‘π•©, the fold sums all the elements in 𝕩 up to i, for i in the range of the length of the list. Ergo, a scan.

Depth of nested lists

Studying tree algorithms in APL, I learned about the depth vector representation. If the nested object in consideration is a string, the best approach is using boolean masks. However, when dealing with a BQN list, recursion becomes necessary to determine the depth of nested elements. Here’s how it can be implemented:

{=β—ΆβŸ¨β‹ˆ0, 1+Β·βˆΎπ•ŠΒ¨βŸ©π•©} ⟨1, ⟨2, ⟨3⟩, ⟨4, ⟨5, ⟨6, 7⟩⟩⟩⟩, 1⟩
⟨ 1 2 3 3 4 5 5 1 ⟩

H-index

This metric is one of the reasons for the deplorable state of modern academia, and the headaches for outsiders trying to get in. Consider that Peter Higgs has an estimated h-index of only 12. By contrast, a random professor nowadays boasts an h-index ten times as high, and exponentially less impact. Enough of ranting, let's concentrate on finding an elegant way to implement this useless thing:

HL ← (+Β΄βˆ˜Β«βŠ’Λœβ‰€+`⌾⌽)Β·/βΌβ‰ βŠΈβŒŠ
HS ← +´∨β‰₯1+βŠ’Λœ
(HL≑HS)β—Ά@β€ΏHL 14β€Ώ14β€Ώ11β€Ώ9β€Ώ5β€Ώ5β€Ώ1β€Ώ1β€Ώ1β€Ώ1β€Ώ0
5

If someone ever published that much, sorting will eventually be slower:

HLβ€ΏHS {π•Žβ€’_timed𝕩}Β¨< 1e8 β€’rand.Range 1e3
⟨ 0.083824959 0.21801262700000001 ⟩

A testament to the idea that the simplest solution in BQN is often the most efficient: I initially clip my citations array with {β‰ Β¨βŠ”β‰ βˆ˜π•©Β¨βŒΎ(β‰₯βŸœβ‰ βˆ˜π•©βŠΈ/)𝕩}, which is just /βΌβ‰ βŠΈβŒŠ.

Footnotes:

1

Almost Perfect Artifacts Improve only in Small Ways: APL is more French than English, Alan J. Perlis (1978). From jsoftware's papers collection.

2

Initially, I intended to rigorously attribute all contributions, but this quickly filled the text with footnotes. I often get help streamlining my solutions from Marshall Lochbaum (the BQN creator), dzaima (the CBQN developer), and other fine folks from the BQN matrix room, thank you all! Please check the logs for more context.

3

Don’t believe me? Just ask Kilgore Trout!