A startup claims it broke through a bottleneck that’s holding back LLMs
Miami-based AI startup Subquadratic has shared the results of an independent evaluation by Appen to validate claims about its new LLM, SubQ. The model uses dynamic sparse attention instead of traditional dense attention, sidestepping the bottleneck in which attention cost grows quadratically with context length. Appen's tests found SubQ was 56 times faster than FlashAttention, scored 89.7% on LiveCodeBench, and achieved a 98% retrieval score on needle-in-a-haystack tests with context windows up to 12 million tokens. However, the model is not yet widely available, and Subquadratic bootstrapped SubQ by reusing weights from the Chinese open-source model Qwen.
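To make the scaling bottleneck concrete, here is a rough back-of-the-envelope sketch in Python. It only counts query-key score computations: dense attention compares every token with every other token, while a sparse scheme caps how many keys each query looks at. The function names and the per-query budget `K` are hypothetical and chosen for illustration; Subquadratic has not described SubQ's actual sparsity pattern here, so this is not a model of its method.

```python
def dense_pairs(n_tokens: int) -> int:
    """Score count for dense attention: every token attends to every token."""
    return n_tokens * n_tokens


def sparse_pairs(n_tokens: int, keys_per_query: int) -> int:
    """Score count when each query attends to at most a fixed number of keys."""
    return n_tokens * min(keys_per_query, n_tokens)


if __name__ == "__main__":
    K = 4096  # hypothetical per-query key budget, NOT a published SubQ parameter
    for n in (128_000, 1_000_000, 12_000_000):
        d = dense_pairs(n)
        s = sparse_pairs(n, K)
        print(f"{n:>12,} tokens: dense={d:.2e} sparse={s:.2e} ratio={d / s:,.0f}x")
```

Under these assumptions, doubling the context quadruples the dense count but only doubles the sparse one, which is why the gap widens as context windows grow. Real-world speedups also depend on how the keys are selected and on hardware efficiency, neither of which this count captures.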
By the numbers
- 12x: text processing capacity compared to most other models
- 56x: speed increase over FlashAttention in Appen's baseline test
- 89.7%: SubQ score on the LiveCodeBench coding test
- $2,600: cost to run Anthropic's Opus 4.6 through RULER 128
- 98%: SubQ needle-in-a-haystack retrieval score at massive scale