In the Optimizations chapter we have briefly described optimizations based on specializations and caching. In this text we describe the caches used by PetitParser2.
PetitParser2 utilizes four different caches that can be split into two categories: caches that reduce superfluous work by a constant factor, and memoization caches that reduce the asymptotic complexity of parsing.
You can inspect the performance of caches in the cache view:
There are three caches visible in the cache view: (i) the cache and (ii) the trimming cache belong to the first category, and (iii) the sea memoization cache belongs to the second category. We discuss these caches briefly in the following text.
The cache improves the performance of a choice whose alternatives share a common prefix. Consider, for example, the two kinds of message sends in Smalltalk: unary (e.g. parser) and keyword (e.g. parser:). Both alternatives start with an identifier. Normally, if the identifier is not followed by :, the parser backtracks and parses the identifier again. The optimizations of PetitParser2 can detect such a case and protect identifier with a cache, preventing its superfluous re-invocation.
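The effect can be sketched in illustrative Python (this is not PetitParser2's actual API; the names and the grammar are assumptions for the example). The identifier result is cached per input position, so when the keyword alternative fails and the unary alternative retries at the same position, the identifier is not parsed a second time:

```python
import re

# Count how often the identifier parser actually does work.
calls = {"identifier": 0}

def identifier(src, pos, cache):
    # Cache results by position: a backtracking alternative reuses
    # what an earlier alternative already parsed at the same position.
    if pos in cache:
        return cache[pos]
    calls["identifier"] += 1
    m = re.match(r"[a-zA-Z]\w*", src[pos:])
    result = (m.group(0), pos + m.end()) if m else None
    cache[pos] = result
    return result

def keyword_message(src, pos, cache):
    r = identifier(src, pos, cache)
    if r and r[1] < len(src) and src[r[1]] == ":":
        return (r[0] + ":", r[1] + 1)
    return None  # fail, the choice backtracks

def unary_message(src, pos, cache):
    return identifier(src, pos, cache)

def message(src, pos=0):
    cache = {}
    # ordered choice: try the keyword alternative first, then unary
    return keyword_message(src, pos, cache) or unary_message(src, pos, cache)

print(message("parser"))    # the keyword alternative fails at the colon,
print(calls["identifier"])  # yet the identifier is parsed only once
```

Without the cache, the call counter would read 2 for this input, because the unary alternative would re-parse the identifier after the keyword alternative backtracks.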
Because each token trims whitespace before and after itself, the trimming between two adjacent tokens happens twice. The trimming cache prevents this. Because trimming whitespace is usually a rather fast operation, the trimming cache has a noticeable impact only for very fast, almost deterministic and context-free grammars such as the Smalltalk grammar.
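A simplified model of this in Python (illustrative only; the token grammar and cache layout are assumptions, not PetitParser2 internals) counts how often whitespace is scanned. Each token trims before and after itself; the cache records that a trimmed position trims to itself, so the second trim at every token boundary is skipped:

```python
trims = {"count": 0}

def trim(src, pos, cache):
    # Skip whitespace starting at pos; with a cache, a position that is
    # already known to be trimmed is returned without rescanning.
    if cache is not None and pos in cache:
        return cache[pos]
    trims["count"] += 1
    end = pos
    while end < len(src) and src[end].isspace():
        end += 1
    if cache is not None:
        cache[pos] = end
        cache[end] = end  # a trimmed position trims to itself
    return end

def token(word, src, pos, cache):
    pos = trim(src, pos, cache)                # trim before the token
    assert src.startswith(word, pos)
    return trim(src, pos + len(word), cache)   # trim after the token

def parse(src, cache):
    trims["count"] = 0
    pos = 0
    for word in ("true", "&", "false"):
        pos = token(word, src, pos, cache)
    return trims["count"]

print(parse("true  &  false", None))  # 6 trims: each gap is trimmed twice
print(parse("true  &  false", {}))    # 4 trims with the trimming cache
```

The saving is a constant factor per token boundary, which is why this cache matters mainly for grammars where trimming is a significant share of the total work.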
The sea memoization cache is tailored to bounded seas. Due to the nature of bounded seas, parsing a grammar with them can in some cases exhibit exponential time complexity, performing lots of redundant work. Because a traditional cache cannot reduce this exponential overhead (it reduces the overhead only by a constant factor), the seas are memoized. Memoization requires more memory, but it gives bounded seas back the good old linear complexity.
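The difference between the two kinds of caching can be made concrete with a generic packrat-style sketch in Python (an assumed toy grammar, not PetitParser2's actual sea implementation). A rule re-invoked at the same position by two alternatives doubles its work at every nesting level; memoizing results per position bounds each rule to one computation per position:

```python
calls = {"T": 0}

def T(src, pos, memo):
    # T -> '(' T ')'  /  '(' T ']'  /  'x'
    if memo is not None and pos in memo:
        return memo[pos]
    calls["T"] += 1
    result = None
    if pos < len(src) and src[pos] == "(":
        for close in (")", "]"):
            inner = T(src, pos + 1, memo)  # re-invoked once per alternative
            if inner is not None and inner < len(src) and src[inner] == close:
                result = inner + 1
                break
    elif pos < len(src) and src[pos] == "x":
        result = pos + 1
    if memo is not None:
        memo[pos] = result
    return result

src = "(((x]]]"
calls["T"] = 0
T(src, 0, None)
print(calls["T"])  # 15 invocations: work doubles at each nesting level
calls["T"] = 0
T(src, 0, {})
print(calls["T"])  # 4 invocations: at most one per position, linear
```

A constant-factor cache cannot rescue the unmemoized version, since the invocation count grows exponentially with the nesting depth, whereas memoization keeps it linear at the cost of storing one result per position.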