If we are looking at an encrypted homophonic English ciphertext (a fairly reasonable assumption here), is there a notable difference between the left-context entropy (i.e.
the information content of the text using the preceding letter as a context for predicting the next letter) and the right-context entropy?
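For anyone wanting to try this, here is a minimal Python sketch of one way the two figures might be estimated from plain bigram counts. It assumes the ciphertext is held as a string with one character per cipher symbol and is at least two symbols long; the function names are my own, not anything from Juzek's paper.

```python
import math
from collections import Counter


def entropy(counts: Counter) -> float:
    """Shannon entropy (in bits) of the distribution implied by raw counts."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def context_entropies(text: str) -> tuple[float, float]:
    """Return (left-context entropy, right-context entropy) from bigram counts.

    left  = H(next symbol | preceding symbol)
    right = H(preceding symbol | next symbol)
    Assumes len(text) >= 2.
    """
    bigrams = Counter(zip(text, text[1:]))  # every adjacent pair
    firsts = Counter(text[:-1])             # symbols seen as the first of a pair
    seconds = Counter(text[1:])             # symbols seen as the second of a pair
    joint = entropy(bigrams)
    return joint - entropy(firsts), joint - entropy(seconds)


# Tiny illustration: after an A comes A or B, but before either comes only A.
left, right = context_entropies("AAB")
print(left, right)
```

Both figures are computed from the same joint bigram entropy, so with this simple estimate any gap between them comes only from the difference between the first-symbol and second-symbol counts.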
I’d much rather have seen that tested than Vigenere (it’s not a Vig, not even close).
As I was reading through Juzek’s paper, I was struck by a quite different question.
That is, might encrypted homophonic English ciphertexts have a distinctly asymmetrical statistical “fingerprint” that would give us confidence that this is indeed what we are looking at in the Z340?
Perhaps this has already been calculated: if so, it’s not work that I’m aware of, so please leave a comment here to help broaden my mind.
Hence it contains 8 x AAA, 1 x AAB, 1 x ABC, and 1 x BCD: and so would have a trigram MSD of (8*8 + 1*1 + 1*1 + 1*1) / 12 = 5.58.
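The arithmetic is easy to check with a few lines of Python. This is only a sketch: the sample string is a hypothetical text I chose because it yields exactly those trigram counts, and the divisor of 12 is copied straight from the worked example rather than derived from any general definition.

```python
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count every overlapping n-gram in the text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def sum_squared_counts(text: str, n: int) -> int:
    """Square each n-gram's instance count and add the squares together."""
    return sum(c * c for c in ngram_counts(text, n).values())


# Hypothetical sample text: ten As followed by BCD.
sample = "A" * 10 + "BCD"

counts = ngram_counts(sample, 3)
total = sum_squared_counts(sample, 3)

# Divisor taken as-is from the worked example (an assumption, not a general rule).
msd = total / 12

print(dict(counts))          # {'AAA': 8, 'AAB': 1, 'ABC': 1, 'BCD': 1}
print(total, round(msd, 2))  # 67 5.58
```

The squaring is what makes the eight repeats of AAA dominate the total: they contribute 64 of the 67.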
However, Juzek quickly flags that this raw metric is not really good enough on its own: The problem with the msd is that there are difficulties with comparing msd’s across data sets.
If the same calculations were repeated with ngram entropies of different orders, I think we might have something more interesting to work with here: but that’s already been done to death in the Zodiac Killer research world.
Moreover, the long-standing suggestion (which I think has a fair amount of evidential support) that the Z340 may well have been constructed in two distinct halves (Z170A and Z170B) would also mess with just about all of his arguments and conclusions.
Juzek then applies these two final metrics (bigram delta MSD and trigram delta MSD) to a number of real and fake ciphers, before concluding that the Z340 is quite unlike the Z408, and that it in fact presents more like fake ciphers than real ciphers.
Clearly, Juzek’s motivation for squaring ngram instance counts at all is to try to somehow ‘reward’ ngrams that are repeated in a given text being tested.
This is because the length of a text influences the msd, as well as the length of a text’s character set.