Shevek's crypto stuffShevek's cryto-stuf at Anarres (in English).
http://crypto.anarres.info/
Fri, 12 Jan 2018 16:48:23 +0100Fri, 12 Jan 2018 16:48:23 +0100Jekyll v3.7.0SIDH a quantum resistant algorithm for DH exchange<p>All the algorithms currently used for <a href="https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange">DH</a> exchange can be
broken in polynomial time by a <a href="https://en.wikipedia.org/wiki/Quantum_computing">quantum computer</a>, whether they are
based on the <a href="https://en.wikipedia.org/wiki/Discrete_logarithm">discrete logarithm problem</a> or on
<a href="https://en.wikipedia.org/wiki/Elliptic-curve_Diffie%E2%80%93Hellman">elliptic curve point multiplication</a>. Besides these, the other
big public key algorithm is <a href="https://en.wikipedia.org/wiki/RSA_(cryptosystem)">RSA</a>, whose
strength lies in the practical impossibility of factoring big numbers, but
it can also be broken in polynomial time by a quantum computer. In fact,
factoring is the usual example of what a <a href="https://en.wikipedia.org/wiki/Quantum_computing">quantum computer</a>
can do, by means of <a href="https://en.wikipedia.org/wiki/Shor%27s_algorithm">Shor’s algorithm</a>.</p>
<p>A closer look at quantum computing is given by the GitHub user
<a href="https://github.com/arikalfus">arikalfus</a> in her blog, in an interesting
<a href="https://akalfusblog.com/posts/quantum-mechanics">article series</a>.</p>
<p>A plethora of public key algorithms resistant to quantum computation
are now being feverishly developed by mathematicians and cryptographers. Among them,
<a href="https://en.wikipedia.org/wiki/Supersingular_isogeny_key_exchange">SIDH</a>, by <a href="http://ia.cr/2011/506">De Feo, Jao and Plût</a>, stands out because
it is still based on elliptic curves.
<strong>SIDH</strong> stands for <em>Supersingular Isogeny Diffie-Hellman</em> key exchange.
The aim of this article is to provide a rough description of what
SIDH is, avoiding overly technical wording (my apologies to mathematicians
and cryptographers if my language is not accurate enough).</p>
<h1 id="some-definitions">Some definitions</h1>
<p>I assume that the reader knows a bit about elliptic curves, how they are
used in public key cryptography, and so on. However, let me recall that
an elliptic curve is often expressed in its Weierstrass form,
$E:\,y^2 = x^3 + a x + b$. An elliptic curve in cryptography is usually
defined over a prime field such as $\mathbb{F}_p$, but it can also be defined
over an extension field of degree $m$,
$\mathbb{F}_{p^m}$. For example, if $m=2$ the field consists of
polynomials of degree at most 1, whose coefficients lie in $\mathbb{F}_p$,
reduced modulo an irreducible polynomial of degree 2.</p>
<h2 id="j-invariant">j-invariant</h2>
<p>It is a property of some mathematical objects, elliptic curves among
them. For a curve in Weierstrass form it is defined as</p>
<script type="math/tex; mode=display">j(E) = 1728 \frac{4a^3}{4a^3 + 27b^2}</script>
<p>Two curves are equivalent (that is, isomorphic over the algebraic closure of the field) if both share the same j-invariant.</p>
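<p>As a small illustration, the formula above can be evaluated over a prime field with modular arithmetic. This is only a sketch: the function name <code>j_invariant</code> and the toy prime 431 are my own choices, not part of any library.</p>

```python
def j_invariant(a: int, b: int, p: int) -> int:
    """j-invariant of y^2 = x^3 + a*x + b over the prime field F_p.

    Hypothetical helper for illustration; requires Python 3.8+ for pow(x, -1, p).
    """
    num = 4 * pow(a, 3, p) % p
    den = (num + 27 * b * b) % p
    if den == 0:
        raise ValueError("singular curve: 4a^3 + 27b^2 = 0 (mod p)")
    # Division in F_p is multiplication by the modular inverse.
    return 1728 * num * pow(den, -1, p) % p


# Toy prime chosen for illustration only.
p = 431
j_b_zero = j_invariant(1, 0, p)   # b = 0, so j = 1728 reduced mod p
j_a_zero = j_invariant(0, 1, p)   # a = 0, so j = 0
```

<p>Rescaling the coefficients as $(a, b) \rightarrow (u^4 a, u^6 b)$ leaves the result unchanged, which is what makes $j$ an invariant.</p>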
<h2 id="supersingularity">Supersingularity</h2>
<p>An elliptic curve is supersingular if some internal properties are
fulfilled. For the purposes of this post, the interesting supersingular curves have
j-invariant 0 or 1728; that is, either $a=0$ or $b=0$.</p>
<h2 id="isogenies">Isogenies</h2>
<p>An isogeny is a map between curves that is also a group homomorphism; it transforms a given curve into another,
$\phi: E \rightarrow E'$, where the curve order is conserved, $\#E = \#E'$.
It is driven by a point of the curve, say $R$, in
such a way that the image of point $R$ is the so-called <em>point at infinity</em>,
$\phi(R) = \{0\}$ (more precisely, the kernel of the isogeny is the subgroup generated by $R$), so
we can label the transform by the point, $\phi_R: E \rightarrow E'$.</p>
<p>An isogeny $\phi_R$ is expressed as a set of formulas (known as <a href="http://eprint.iacr.org/2011/430.pdf">Vélu’s
formulas</a>) depending on point $R$
that yield $\{a',b'\}$ from $\{a,b\}$ and map any
point $Q$ in $E$ to $Q'$ in $E'$. In principle, they consist of a handful of
field multiplications and divisions, repeated a number of times equal to
the order of $R$, but there are strategies to reduce this cost
(we will see more about this later).</p>
<p>In general, we can say that an isogeny is approximately as costly to compute as a
point multiplication.</p>
<h3 id="the-role-of-isogenies">The role of isogenies</h3>
<p>Before plunging into the details of SIDH, it is worth showing roughly the
role of isogenies in it.</p>
<p>Each party of a DH exchange secretly computes a point $R$ that generates the
kernel of an isogeny. Then the isogeny is applied to a known point $P$ and
yields $P'$ in $E'$, which is made public.</p>
<p>It is hard to recover the kernel $R$ knowing $P$ and $P'$. For a
classical computer it can be solved in $\mathcal{O}(p^{1/4})$;
so, like
<a href="https://en.wikipedia.org/wiki/Elliptic-curve_Diffie%E2%80%93Hellman">the inversion of a point multiplication</a>, it takes exponential time.
If we use a <a href="https://en.wikipedia.org/wiki/Quantum_computing">quantum computer</a> the kernel can be found in subexponential
time,
<strong>except if the original curve, $E$, is supersingular</strong>; then,
the difficulty of the problem rises to $\mathcal{O}(p^{1/6})$! That’s
the benefit of supersingular curves.</p>
<h1 id="sidh-description">SIDH description</h1>
<p>In SIDH this latter idea is central. Let’s see how.</p>
<p>The field used is based on a prime, $p$, of the form
$ (\ell_A^{\,e_A})(\ell_B^{\,e_B})f \pm 1 $, where $\ell_A,\ell_B$
are distinct small primes (typically $\ell_A=2, \ell_B=3$) and $f$ is a
small number that fulfils $\gcd(f,\ell_A)=1,\gcd(f,\ell_B)=1$. The exponents
are chosen in such a way that both terms have similar size:
$\ell_A^{\,e_A} \simeq \ell_B^{\,e_B}$.</p>
<p>We will focus on primes $p = 3\,\mathrm{mod}\,4$; so
$p = (\ell_A^{\,e_A})(\ell_B^{\,e_B})f - 1 $ if $\ell_A = 2$ (if not, we
can force $f = 0\,\mathrm{mod}\,4$ because it is interesting to keep
the “-1” form).</p>
<p>The field is chosen as $\mathbb{F}_{p^2}$ with the irreducible reduction
polynomial $X^2 + 1$ (it is irreducible thanks to the choice
$p = 3\,\mathrm{mod}\,4$). Notice that this field has the same arithmetic rules as
the complex numbers, with coefficients reduced modulo $p$; in fact, we can write the elements of the field as
$a + bi$.</p>
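<p>To make the shape of these primes concrete, here is a minimal sketch that searches for the smallest cofactor $f$ giving a prime of the “-1” form. The function names and the toy exponents are my own assumptions, and trial division is only suitable for toy sizes.</p>

```python
def is_prime(n: int) -> bool:
    """Trial division; fine for toy sizes only."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def find_sidh_prime(e_a: int, e_b: int, l_a: int = 2, l_b: int = 3, max_f: int = 1000):
    """Return (f, p) with p = l_a^e_a * l_b^e_b * f - 1 prime, or None.

    Hypothetical helper: it tries cofactors f in increasing order and skips
    those sharing a factor with l_a or l_b (both assumed prime).
    """
    base = l_a ** e_a * l_b ** e_b
    for f in range(1, max_f + 1):
        if f % l_a == 0 or f % l_b == 0:
            continue
        p = base * f - 1
        if is_prime(p):
            return f, p
    return None


# Toy exponents, far too small for any real use.
f, p = find_sidh_prime(4, 3)
```

<p>With $\ell_A = 2$ and $e_A \geq 2$ the result is automatically congruent to 3 modulo 4, as required above.</p>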
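<p>The “complex numbers modulo $p$” arithmetic can be sketched in a few lines. Elements $a + bi$ are represented as pairs <code>(a, b)</code>; the helper names and the toy prime are assumptions made for this illustration.</p>

```python
P = 431  # toy prime with P % 4 == 3, for illustration only


def fp2_mul(u, v, p=P):
    """(a + b*i) * (c + d*i) with i^2 = -1 and coefficients reduced mod p."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % p, (a * d + b * c) % p)


def fp2_inv(u, p=P):
    """Inverse of a + b*i: the conjugate divided by the norm a^2 + b^2.

    The norm of a nonzero element is nonzero because -1 is not a square
    mod p when p = 3 mod 4.
    """
    a, b = u
    norm_inv = pow(a * a + b * b, -1, p)
    return (a * norm_inv % p, -b * norm_inv % p)


i_squared = fp2_mul((0, 1), (0, 1))          # i * i = -1, i.e. (p - 1, 0)
one = fp2_mul((3, 7), fp2_inv((3, 7)))       # u * u^(-1) = 1
```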
<p>Now we define a supersingular curve over this field; for example, the simplest
one:</p>
<script type="math/tex; mode=display">y^2 = x^3 + x</script>
<p>Its cardinality is
$\#E = (p+1)^2 = (\ell_A^{\,e_A}\ell_B^{\,e_B}\cdot f)^2$; so,
the curve has many cyclic subgroups. We are interested in
$\ell_A^{\,e_A}$-torsion and $\ell_B^{\,e_B}$-torsion subgroups.</p>
<h2 id="public-bases">Public bases</h2>
<p>As in <a href="https://en.wikipedia.org/wiki/Elliptic-curve_Diffie%E2%80%93Hellman">ECDH</a>, a fixed point of the curve must be defined. Actually there
are two pairs of fixed points, called A-base ($P_A,Q_A$) and B-base
($P_B,Q_B$), with respective point orders $\ell_A^{\,e_A}$ and $\ell_B^{\,e_B}$.</p>
<p>For the A-base we select a random seed point, $S_A$; then a $P_A$ candidate
is calculated as $P_A = (\ell_B^{\,e_B}\cdot f)^2 S_A$. Most likely $P_A$
has order exactly $\ell_A^{\,e_A}$ and we are done. If not, we try
another seed point. Now we must select an independent A-base point, $Q_A$;
the easiest way to do this is by applying a distortion map $\psi$ to $P_A$,
defined as $\psi : (x,y) \rightarrow (-x,iy)$. So $Q_A = \psi(P_A)$.</p>
<p>The B-base is selected correspondingly: a seed point $S_B$ is chosen, then
$P_B = (\ell_A^{\,e_A}\cdot f)^2 S_B$, etc.</p>
<p>So the partners of an SIDH exchange must share the following parameters
that define the system: $\ell_A,\ell_B,e_A,e_B,f,P_A,Q_A,P_B,Q_B$.</p>
<h2 id="the-interchange-mechanism">The interchange mechanism</h2>
<p>Now we are ready to understand how SIDH works. As usual, the partners are
Alice and Bob.</p>
<p>Alice uses A-base. She takes two random numbers $m_A,n_A$ (not both divisible
by $\ell_A$) and computes</p>
<script type="math/tex; mode=display">R = m_A P_A + n_A Q_A</script>
<p>so $R$ is a $\ell_A^{\,e_A}$-torsion point; $m_A,n_A$ can be considered
Alice’s secret key; she also keeps the point $R$ hidden. Alice now uses
$R$ as the kernel of an isogeny and calculates a new curve
$E_R = \phi_R (E)$ and the images of the B-base under the isogeny,
$\phi_R (P_B),\phi_R (Q_B)$. Now she sends the set
$\{E_R,\phi_R (P_B),\phi_R (Q_B)\}$ to Bob.</p>
<p>Meanwhile, Bob creates his own secret key $m_B,n_B$ (not both
divisible by $\ell_B$) and computes the secret point</p>
<script type="math/tex; mode=display">T = m_B P_B + n_B Q_B</script>
<p>Bob proceeds <em>mutatis mutandis</em> and sends the parameters
$\{E_T,\phi_T (P_A),\phi_T (Q_A)\}$ to Alice. She computes</p>
<script type="math/tex; mode=display">R' = m_A \phi_T (P_A) + n_A \phi_T (Q_A)</script>
<p>and also the image of the curve, $E_{TR} = \phi_{R'} (E_T)$, and
its j-invariant $j(E_{TR})$. Similarly, Bob computes</p>
<script type="math/tex; mode=display">T' = m_B \phi_R (P_B) + n_B \phi_R (Q_B)</script>
<p>and ends up with $E_{RT} = \phi_{T'} (E_R)$ and $j(E_{RT})$.</p>
<p>The trick is: <strong>both curves are equivalent!</strong> That is</p>
<script type="math/tex; mode=display">j\left(\phi_{R'} \left(\phi_{T} (E) \right) \right) =
j\left(\phi_{T'} \left(\phi_{R} (E) \right) \right)</script>
<p>so they can use this common j-invariant —Alice computes $j(E_{TR})$
and Bob computes $j(E_{RT})$— as the shared key for a symmetric
encryption.</p>
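<p>The reason behind this equality can be sketched in two steps, assuming only that an isogeny is a group homomorphism. First, Alice’s point $R'$ is just the image of her secret point under Bob’s isogeny:</p>
<script type="math/tex; mode=display">R' = m_A \phi_T (P_A) + n_A \phi_T (Q_A) = \phi_T (m_A P_A + n_A Q_A) = \phi_T (R)</script>
<p>and likewise $T' = \phi_R (T)$. Second, both composed isogenies start at $E$ and have the same kernel, the subgroup generated by $R$ and $T$, so both arrive at the same quotient curve up to isomorphism:</p>
<script type="math/tex; mode=display">\phi_{R'} \left(\phi_{T} (E) \right) \cong E / \langle R, T \rangle \cong \phi_{T'} \left(\phi_{R} (E) \right)</script>
<p>Isomorphic curves share their j-invariant, which is why the two partners obtain the same value.</p>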
<h2 id="security">Security</h2>
<p>As I said, the security relies on the practical impossibility of inverting
an isogeny. So, although $\{\phi_R (P_B),\phi_R (Q_B)\}$ is
sent over an insecure channel (or even made public), the kernel $R$ is hard to
find, even for a <a href="https://en.wikipedia.org/wiki/Quantum_computing">quantum computer</a> (if $R$ is found, then the
<a href="https://en.wikipedia.org/wiki/Elliptic-curve_Diffie%E2%80%93Hellman">ECDH</a> problem is easy to solve, because of the “smooth” order of $R$,
$\ell_A^{\,e_A}$, and the private key, $(m_A, n_A)$, can be revealed).</p>
<p>We can refine the estimate of isogeny inversion hardness given earlier:
for a classical computer it is
$\mathcal{O}(\min(\ell_A^{\,e_A/2}, \ell_B^{\,e_B/2}))$
and for a quantum computer,
$\mathcal{O}(\min(\ell_A^{\,e_A/3}, \ell_B^{\,e_B/3}))$.</p>
<h2 id="computation-of-isogenies">Computation of isogenies</h2>
<p>Throughout the post I have talked about computing isogenies. This is a very tricky step,
and I will sketch it very roughly (the topic deserves a full post).</p>
<p>As I said, in principle computing an isogeny means repeating some field
operations (multiplications and divisions) as many times as the order
of the kernel point. But if the order is “smooth”, that is, if
it is a product of small primes, then we can also compose “small” isogenies
driven by these small primes, and they build up the full isogeny.</p>
<p>By construction, the isogeny kernel is generated by a point of order
$\ell_A^{\,e_A}$ or $\ell_B^{\,e_B}$, so the computation breaks down into a chain
of $\ell_A$-isogenies (respectively $\ell_B$-isogenies). These
$\ell$-isogenies are “small” in the sense described above: they need only a few
field operations, and the smaller $\ell$, the fewer operations are
needed.</p>
<p>The number of $\ell$-isogenies needed to build up an $\ell^e$-isogeny
grows polynomially with the exponent $e$. We also need some multiplications of
points by $\ell$, whose number also grows with $e$.
The interesting part is that the way to combine $\ell$-isogenies and
multiplications by $\ell$ is not unique, but only a few combinations are optimal regarding
performance.</p>
<h1 id="conclusion">Conclusion</h1>
<p>In SIDH both partners share the <a href="http://mathworld.wolfram.com/j-Invariant.html">j-invariant</a> of a curve, calculated via
two isogenies that are exchanged. Public knowledge of the exchanged
curves does not reveal the final <a href="http://mathworld.wolfram.com/j-Invariant.html">j-invariant</a> without the private
multipliers used by the partners. If we want to break the security with a
classical computer, the problem takes exponential time, as <a href="https://en.wikipedia.org/wiki/Elliptic-curve_Diffie%E2%80%93Hellman">ECDH</a> does; for a quantum computer
the complexity is also exponential, but only if the base curve is supersingular.</p>
Thu, 16 Nov 2017 00:00:00 +0100
http://crypto.anarres.info/2017/sidh
http://crypto.anarres.info/2017/sidhcryptoSIDHquantum-resistantisogenysupersingullarECDHSide-channel attack to modular inversion<p><a href="http://mathworld.wolfram.com/ModularInverse.html">Modular inversion</a> is a common mathematical operation that appears in
cryptographic algorithms based on <a href="https://en.wikipedia.org/wiki/Finite_group">finite groups</a> defined modulo a prime number.
Mainly, these algorithms are related
to public key cryptography, especially to <a href="https://en.wikipedia.org/wiki/Elliptic_curve_cryptography">Elliptic Curve Cryptography (ECC)</a>.
Computing a <a href="http://mathworld.wolfram.com/ModularInverse.html">modular inverse</a> is always expensive; it is roughly
100 times costlier than the opposite operation, the modular product.</p>
<h1 id="algorithms--performance">Algorithms & performance</h1>
<p>Basically there are two main algorithms to calculate <a href="http://mathworld.wolfram.com/ModularInverse.html">modular inverses</a>: the
one based on <a href="http://mathworld.wolfram.com/TotientFunction.html">Euler’s totient theorem</a>, and the other based on the <a href="http://mathworld.wolfram.com/EuclideanAlgorithm.html">Euclidean
GCD algorithm</a>.</p>
<h2 id="algorithm-based-on-eulers-totient">Algorithm based on Euler’s totient</h2>
<p><a href="http://mathworld.wolfram.com/TotientFunction.html">Euler’s totient theorem</a> states, for any $a$ coprime
to $p$ (so, if $p$ is prime, for every $a$ not divisible by $p$):</p>
<script type="math/tex; mode=display">a^{\phi(p)} = 1 \;(\mathrm{mod}\ p)</script>
<p>where $\phi(p)$ is the <a href="http://mathworld.wolfram.com/TotientFunction.html">Euler’s totient function</a>, which for $p$ prime
is $\phi(p) = p -1$; so, we can obtain the inverse by computing a modular
exponentiation:</p>
<script type="math/tex; mode=display">a^{-1} \,\mathrm{mod}\ p = a^{\phi(p) - 1} \;\mathrm{mod}\ p
= a^{p-2} \;\mathrm{mod}\ p</script>
<p>But modular exponentiation is quite costly. This solution requires at least
$\log_2(p)$ modular squarings and about half as many products (or fewer using some tricks).</p>
<h2 id="algorithm-based-on-euclides-gcd-algorithm">Algorithm based on Euclid’s GCD algorithm</h2>
<p>I write down one of the possible Euclidean algorithms for <a href="http://mathworld.wolfram.com/ModularInverse.html">modular inverse</a>
in pseudocode (taken from <em>Guide to Elliptic Curve Cryptography</em> by D. Hankerson, A. Menezes
& S. Vanstone):</p>
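<p>The squarings and products mentioned above can be made visible with a plain left-to-right square-and-multiply loop. This is a sketch with a function name of my own; in practice Python’s built-in <code>pow(a, p - 2, p)</code> does the same job.</p>

```python
def inverse_by_exponentiation(a: int, p: int) -> int:
    """Compute a^(p-2) mod p for an odd prime p, one exponent bit at a time."""
    result = 1
    for bit in bin(p - 2)[2:]:
        result = result * result % p      # one squaring per exponent bit
        if bit == "1":
            result = result * a % p       # one product per set bit
    return result


# The prime of Curve25519, used here only as a convenient large example.
p = 2**255 - 19
a = 123456789
inv = inverse_by_exponentiation(a, p)
```

<p>Note that the branch depends only on the bits of the public exponent $p-2$, never on $a$.</p>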
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>INPUT : Prime p and a ∈ [1, p − 1].
OUTPUT : a^(−1) mod p.
1. u ← a, v ← p.
2. x1 ← 1, x2 ← 0.
3. While (u != 1 and v != 1) do
   3.1 While u is even do
         u ← u/2.
         If x1 is even then x1 ← x1/2; else x1 ← (x1 + p)/2.
   3.2 While v is even do
         v ← v/2.
         If x2 is even then x2 ← x2/2; else x2 ← (x2 + p)/2.
   3.3 If u ≥ v then: u ← u − v, x1 ← x1 − x2;
       Else: v ← v − u, x2 ← x2 − x1.
4. If u = 1 then return(x1 mod p); else return(x2 mod p).
</code></pre></div></div>
<p>This is considerably faster than the exponentiation method (but still very slow compared to a
product).</p>
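<p>For readers who want to experiment, the pseudocode above translates almost line by line into Python. This is my own transcription, assuming $p$ is an odd prime and $1 \le a \le p-1$; the function name is hypothetical.</p>

```python
def binary_inverse(a: int, p: int) -> int:
    """Binary Euclidean inversion of a modulo an odd prime p."""
    u, v = a, p
    x1, x2 = 1, 0
    # Invariants: a*x1 = u (mod p) and a*x2 = v (mod p).
    while u != 1 and v != 1:
        while u % 2 == 0:                                   # step 3.1
            u //= 2
            x1 = x1 // 2 if x1 % 2 == 0 else (x1 + p) // 2
        while v % 2 == 0:                                   # step 3.2
            v //= 2
            x2 = x2 // 2 if x2 % 2 == 0 else (x2 + p) // 2
        if u >= v:                                          # step 3.3
            u, x1 = u - v, x1 - x2
        else:
            v, x2 = v - u, x2 - x1
    return x1 % p if u == 1 else x2 % p


# Toy prime for a quick exhaustive check.
p = 431
inverses = [binary_inverse(a, p) for a in range(1, p)]
```

<p>The number of loop iterations visibly depends on the value of <code>a</code>, which is the root of the side-channel concern with this algorithm.</p>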
<h1 id="algorithms--side-channel-attacks">Algorithms & side-channel attacks</h1>
<p>The Euclidean algorithm has the drawback that its execution time depends on the
input $a$. See, for example, steps 3.1, 3.2 and 3.3, where the code runs quicker
or slower depending on the outcome of each test. Then, if $a$ is sensitive data (say, a
private key), the algorithm can leak information about it through timing or power
analysis.</p>
<p>On the other hand, the Euler algorithm, despite its poor performance, runs in constant
time regardless of the input. This is why <a href="https://cr.yp.to/djb.html">Daniel Bernstein</a> chose
it for his <a href="https://cr.yp.to/papers.html#curve25519">Curve25519</a> and <a href="https://cr.yp.to/papers.html#ed25519">Ed25519</a> projects.</p>
<h2 id="a-simple-way-to-use-euclidean-algorith-safely">A simple way to use the Euclidean algorithm safely</h2>
<p>I know that there are efforts to modify the Euclidean algorithm so that it computes
the <a href="http://mathworld.wolfram.com/ModularInverse.html">modular inverse</a> in constant time, without depending on the input
$a$.</p>
<p>But there is a trick that allows using the Euclidean algorithm without leaking
information about the inverted number:</p>
<p>A secret number $0 < k < p$ is drawn at random. We can force $k$ to have its
highest possible bit set. Then we perform the following operations to obtain
the modular inverse of $a$:</p>
<script type="math/tex; mode=display">s = ka \,\mathrm{mod}\ p</script>
<script type="math/tex; mode=display">s' = \mathrm{EuclideanInverse}(s, p)</script>
<script type="math/tex; mode=display">a^{-1} = ks' \,\mathrm{mod}\ p</script>
<p>It works because</p>
<script type="math/tex; mode=display">ks' \,\mathrm{mod}\ p = k/(ka) \,\mathrm{mod}\ p
= 1/a \,\mathrm{mod}\ p</script>
<p>The algorithm may leak information about $s$, which is useless to an attacker
because $k$ is secret. Moreover, if the same value is inverted again, a
different $k$ is drawn, so it is impossible to harvest any information
about $a$ across runs.</p>
<p>So, at the price of two multiplications, we prevent the algorithm from leaking information
about sensitive data. It can be even cheaper, because the Euclidean algorithm
can compute $k/s \pmod p$ directly by replacing <code class="highlighter-rouge">x1 ← 1</code> by <code class="highlighter-rouge">x1 ← k</code> in step 2 of
the code shown above. So,</p>
<script type="math/tex; mode=display">a^{-1} = \mathrm{EuclideanDivision}(k, s, p)</script>
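<p>A minimal sketch of the three-step blinding described above follows. The function name is my own, and Python’s built-in <code>pow(s, -1, p)</code> is used merely as a stand-in for the variable-time Euclidean inversion; the optional forcing of the top bit of $k$ is left out.</p>

```python
import secrets


def blinded_inverse(a: int, p: int) -> int:
    """Invert a mod p while only ever exposing s = k*a to the inversion routine."""
    k = secrets.randbelow(p - 1) + 1      # fresh random blind, 0 < k < p
    s = k * a % p                         # blinded value
    s_inv = pow(s, -1, p)                 # stand-in for EuclideanInverse(s, p)
    return k * s_inv % p                  # k / (k*a) = 1/a (mod p)


p = 2**255 - 19                           # example prime, for illustration only
a = 0x1234567890ABCDEF
inv = blinded_inverse(a, p)
```

<p>Each call draws a new $k$, so two inversions of the same $a$ feed different values of $s$ to the Euclidean routine.</p>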
<p><em>P.S. 2017-07-24</em> Of course this is not a new trick. It is very well known.
Even <a href="https://cr.yp.to/djb.html">djb</a> knows it, as he writes in <a href="https://cr.yp.to/ecdh/curve25519-20051115.pdf">the first article that describes
Curve25519</a>:</p>
<blockquote>
<p>This (Euler inversion) is about 7% of the Curve25519 computation. An
extended-Euclid inversion of $z$, <strong>randomized to protect against timing attacks</strong>,
might be faster, but the maximum potential speedup is very small, while the
cost in code complexity is large.</p>
</blockquote>
<p>Well, I simply do not agree. With <a href="https://gmplib.org/">the GNU Multiple Precision Arithmetic
Library</a> the code is quite simple!</p>
Sun, 23 Jul 2017 00:00:00 +0200
http://crypto.anarres.info/2017/modular-inversion
http://crypto.anarres.info/2017/modular-inversioncryptomathfinite-groupsThe maths of Secret Santa<p><strong>Secret Santa</strong> is a way to exchange gifts among workmates, family, etc.
It is organized
in such a way that every person commits to secretly giving a gift to another,
and the latter
does not know who the giver is. In Spanish this game is called
<em>amigo invisible</em> (invisible friend).</p>
<p><em>Secret Santa</em> should take care of two points:</p>
<ol>
<li>It must be secret (nobody can know her <em>Santa</em>)</li>
<li>Whatever the assignment method, nobody must be her own <em>Santa</em></li>
</ol>
<p>How can we organize a Secret Santa?</p>
<p>The naïve approach for, say, a 4-member group (Arthur, Beatrice, Charles
and Diana {A, B, C, D}) is creating a random <a href="https://en.wikipedia.org/wiki/Permutation">permutation</a>
of this set and putting it beside the original. For example,
suppose we have randomly drawn the permutation {B, D, A, C}:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> A B C D
B D A C
</code></pre></div></div>
<p>It can be read as: Arthur gifts to Beatrice, Beatrice to Diana,
Charles to Arthur and Diana to Charles.</p>
<p>The problem arises when this random permutation puts a person beside herself,
which must not occur according to the second condition stated before;
for example, the permutation {C, B, D, A} is not allowed because Beatrice
would be her own Secret Santa.</p>
<p>So the question stands: How many permutations of the original
set are valid under the Secret Santa conditions? In other words, <strong>how
many permutations of the original set shift all elements from
their starting positions?</strong></p>
<h1 id="analysis">Analysis</h1>
<p>So the matter is the analysis of the permutations of an $n$-element set. The
total number of permutations is $P(n) = n!$ and we will call $p(n)$
the number of valid Secret Santa permutations.</p>
<h2 id="the-easy-cases">The easy cases</h2>
<p>Let’s start with the easiest cases. By simple finger calculation
we obtain:</p>
<ul>
<li>Case $n = 1$. Then $P(1) = 1$ and, obviously, $p(1) = 0$
since the only participant can’t give a gift to herself.</li>
<li>Case $n = 2$. Then $P(2) = 2$ and $p(2) = 1$. If the set is {A, B},
the only valid permutation is {B, A}.</li>
<li>Case $n = 3$. Then $P(3) = 6$ and $p(3) = 2$. As we can easily check,
for the initial set {A, B, C} there are only two valid permutations:
{B, C, A} and {C, A, B}, which correspond to the two circular shifts.</li>
</ul>
<p>From now on we must proceed carefully. Let’s plunge into the non-trivial cases.</p>
<h2 id="4-elements">4 elements</h2>
<p>Let’s begin with the next step in complexity (4 elements). First we try
to calculate the number of permutations where
<strong>there are coincidences (at least one)</strong>, and we denote this number
by $\pi(n)$. So:</p>
<script type="math/tex; mode=display">p(n) = P(n) - \pi(n)</script>
<p>In addition we define permutations with partial coincidences. So $\pi_1(4)$
is the number of permutations of a 4-element set where one, and only one, of its
elements coincides with the starting permutation; $\pi_2(4)$,
$\pi_3(4)$, $\pi_4(4)$ represent the number of permutations where exactly
2, 3 and 4 elements coincide, respectively. So for general $n$:</p>
<script type="math/tex; mode=display">\pi(n) = \sum_{i=1}^{n} \pi_i(n)</script>
<h3 id="1-coincidence-4-elements">1 coincidence (4 elements)</h3>
<p>We calculate $\pi_1(4)$. Let’s suppose that there is a coincidence in the first
element. The other 3 elements <strong>can’t coincide</strong>, and we know that the number
of non-coincident permutations of three elements is $p(3) = 2$. Since the
single coincidence can sit at any of the 4 positions, we have $4\times 2 = 8$
permutations where exactly one element coincides. So:</p>
<script type="math/tex; mode=display">\pi_1(4) = 4 \times p(3) = 8</script>
<h3 id="2-coincidences-4-elements">2 coincidences (4 elements)</h3>
<p>When we fix 2 coincident elements there are 2 others that must not coincide,
which can be arranged in $p(2) = 1$ way. As for the coincident pair, we must count
all the <a href="https://en.wikipedia.org/wiki/Combination">combinations</a> of 4 elements taken in pairs, that is</p>
<script type="math/tex; mode=display">\binom{4}{2} = \frac{4 \times 3}{2} = 6</script>
<p>Thus,</p>
<script type="math/tex; mode=display">\pi_2(4) = \binom{4}{2} p(2) = 6</script>
<h3 id="3-coincidences-4-elements">3 coincidences (4 elements)</h3>
<p><strong>3 coincidences and 1 mismatch cannot happen</strong> in a 4-element set, because if three elements stay in place the fourth must stay too:</p>
<script type="math/tex; mode=display">\pi_3(4) = \binom{4}{3} p(1) = 0</script>
<h3 id="4-coincidences-4-elements">4 coincidences (4 elements)</h3>
<p>That is trivially 1:</p>
<script type="math/tex; mode=display">\pi_4(4) = 1</script>
<p>In general it is easy to see that $\pi_n(n) = 1$.</p>
<h3 id="total-number-of-coincidences">Total number of coincidences</h3>
<p>So,</p>
<script type="math/tex; mode=display">\pi(4) = 8 + 6 + 0 + 1 = 15</script>
<p>And then, the number of valid permutations for a Secret Santa with 4 friends is 9:</p>
<script type="math/tex; mode=display">p(4) = P(4) - \pi(4) = 24 - 15 = 9</script>
<p>Unfortunately the proposed method is not constructive and thus it does not provide
the 9 legitimate permutations. These are:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{B, C, D, A}, {C, D, A, B}, {D, A, B, C},
{B, A, D, C}, {C, D, B, A}, {D, C, B, A},
{C, A, D, B}, {B, D, A, C}, {D, C, A, B}
</code></pre></div></div>
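<p>The list can be reproduced by brute force, filtering all permutations and keeping those with no element in its starting position. A short sketch (the function name is mine):</p>

```python
from itertools import permutations


def valid_assignments(items):
    """All permutations of items in which no element keeps its position."""
    items = tuple(items)
    return [perm for perm in permutations(items)
            if all(x != y for x, y in zip(perm, items))]


valid = ["".join(perm) for perm in valid_assignments("ABCD")]
```

<p>This enumeration costs $n!$ steps, so it is only practical for small groups.</p>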
<h2 id="general-method">General method</h2>
<p>So it is easy to deduce the general formula for
$\pi(n)$, for $n > 2$:</p>
<script type="math/tex; mode=display">\pi(n) = 1 + \sum_{i=1}^{n-2} \binom{n}{i} p(n-i)</script>
<p>or in terms of $p(n)$</p>
<script type="math/tex; mode=display">p(n) = n! - 1 - \sum_{i=1}^{n-2} \binom{n}{i} p(n-i)</script>
<p>The drawback is that this formula is recursive; in order to get
$p(n)$ we need all the previous values $p(1), \ldots, p(n-1)$.
For example, the calculation for $n=5$:</p>
<script type="math/tex; mode=display">p(5) = 120 - 1 -\left(5\cdot 9 + 10\cdot 2 + 10\cdot 1 \right) = 44</script>
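<p>The recursion is easy to mechanize by filling in the values from the bottom up. A sketch, with a function name of my own and the base cases $p(1) = 0$, $p(2) = 1$ taken from the easy cases above:</p>

```python
from math import comb, factorial


def count_valid(n: int) -> int:
    """p(n) from the recursive formula, for n >= 1."""
    known = {1: 0, 2: 1}
    for m in range(3, n + 1):
        # pi(m) = 1 + sum_{i=1}^{m-2} C(m, i) * p(m - i)
        coincidences = 1 + sum(comb(m, i) * known[m - i] for i in range(1, m - 1))
        known[m] = factorial(m) - coincidences
    return known[n]


values = [count_valid(n) for n in range(1, 8)]
```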
<h1 id="a-surprise">A surprise!…</h1>
<p>$p(n)$ is always lower than $P(n)$. Looking at the values calculated so far,
it seems that there is a proportionality factor between them. Does it vary, diverge, oscillate
or perhaps converge with $n$? Let’s see with the following table, with an
extra $n=6$ case added:</p>
<table>
<thead>
<tr>
<th style="text-align: center">$n$</th>
<th style="text-align: center">$P(n)$</th>
<th style="text-align: center">$p(n)$</th>
<th style="text-align: center">$P(n)/p(n)$</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center">2</td>
<td style="text-align: center">2</td>
<td style="text-align: center">1</td>
<td style="text-align: center">2.000</td>
</tr>
<tr>
<td style="text-align: center">3</td>
<td style="text-align: center">6</td>
<td style="text-align: center">2</td>
<td style="text-align: center">3.000</td>
</tr>
<tr>
<td style="text-align: center">4</td>
<td style="text-align: center">24</td>
<td style="text-align: center">9</td>
<td style="text-align: center">2.667</td>
</tr>
<tr>
<td style="text-align: center">5</td>
<td style="text-align: center">120</td>
<td style="text-align: center">44</td>
<td style="text-align: center">2.727</td>
</tr>
<tr>
<td style="text-align: center">6</td>
<td style="text-align: center">720</td>
<td style="text-align: center">265</td>
<td style="text-align: center">2.717</td>
</tr>
</tbody>
</table>
<p>So, it looks like there is a convergence… to what value? Isn’t that
limit familiar, about $2.72$? Indeed it is! It is the
famous number $e$, base of the natural logarithm, $e = 2.71828\ldots$
In fact we can propose the following formula (without formal proof)
for $p(n)$ that does not need the previous $n-1$ values:</p>
<script type="math/tex; mode=display">p(n) = \lfloor n!/e + 0.5 \rfloor</script>
<p>It predicts all the previously tabulated values. We can now predict
the term $n=7$:</p>
<script type="math/tex; mode=display">7!/e = 5040/2.71828\ldots \simeq 1854.11</script>
<p>Hence $p(7) = 1854$. I invite the reader to check this result with the
exact formula.</p>
<p>So, if we generate a random permutation, the probability that it gives a valid
Secret Santa is about $1/e$ or, say, 37%.</p>
<p>Isn’t it curious that, looking into gift sharing by means of
Secret Santa, we arrived at the number $e$!?</p>
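<p>Both observations can be tried out in a few lines: the rounding formula, and a practical draw that simply reshuffles until nobody is her own Santa (on average about $e \approx 2.7$ attempts, given the 37% success rate). The function names are my own, and the floating-point rounding is only trustworthy for small $n$.</p>

```python
import random
from math import e, factorial, floor


def count_by_rounding(n: int) -> int:
    """p(n) as n!/e rounded to the nearest integer (small n only)."""
    return floor(factorial(n) / e + 0.5)


def draw_secret_santa(people, rng=random):
    """Reshuffle until no person is assigned to herself."""
    people = list(people)
    while True:
        shuffled = people[:]
        rng.shuffle(shuffled)
        if all(giver != receiver for giver, receiver in zip(people, shuffled)):
            return dict(zip(people, shuffled))


predicted = [count_by_rounding(n) for n in range(1, 8)]
assignment = draw_secret_santa(["Arthur", "Beatrice", "Charles", "Diana"])
```

<p>The returned dictionary maps each giver to the person who receives her gift; it needs at least two participants to terminate.</p>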
<hr />
<p><em>P.S. 2016-12-04</em> Reddit user <a href="https://www.reddit.com/user/louiswins">louiswins</a>
points out that permutations of this kind are called
<a href="https://en.wikipedia.org/wiki/Derangement">derangements</a>.</p>
Thu, 01 Dec 2016 00:00:00 +0100
http://crypto.anarres.info/2016/secret_santa_maths
http://crypto.anarres.info/2016/secret_santa_mathsmathpermutationsRC4 as pencil & paper cipher<p><a href="https://en.wikipedia.org/wiki/RC4">RC4</a> is a well-known stream cipher, extremely simple
(I’d say <em>minimalist</em>) and strong enough to be
still in use, in spite of some documented weaknesses which mostly
concern the key schedule.</p>
<p><a href="https://en.wikipedia.org/wiki/RC4">RC4</a> uses a 256-symbol alphabet, so it operates at
byte level. The main object of <a href="https://en.wikipedia.org/wiki/RC4">RC4</a> is an S-box, <code class="highlighter-rouge">S</code>, which contains a
permutation of the alphabet derived from a key through a key
schedule. Then, each iteration modifies the permutation and
yields a byte used to mask the plaintext. The pseudocode:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">i</span> <span class="o">:=</span> <span class="mi">0</span>
<span class="n">j</span> <span class="o">:=</span> <span class="mi">0</span>
<span class="k">while</span> <span class="n">GeneratingOutput</span> <span class="p">{</span>
<span class="n">i</span> <span class="o">:=</span> <span class="p">(</span><span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="n">mod</span> <span class="mi">256</span>
<span class="n">j</span> <span class="o">:=</span> <span class="p">(</span><span class="n">j</span> <span class="o">+</span> <span class="n">S</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="n">mod</span> <span class="mi">256</span>
<span class="n">swap</span> <span class="n">values</span> <span class="n">of</span> <span class="n">S</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="n">and</span> <span class="n">S</span><span class="p">[</span><span class="n">j</span><span class="p">]</span>
<span class="n">T</span> <span class="o">:=</span> <span class="n">S</span><span class="p">[(</span><span class="n">S</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">S</span><span class="p">[</span><span class="n">j</span><span class="p">])</span> <span class="n">mod</span> <span class="mi">256</span><span class="p">]</span>
<span class="n">output</span> <span class="n">T</span>
<span class="p">}</span>
</code></pre></div></div>
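<p>For reference, the loop above can be run as follows. The pseudocode only covers the output generation, so this sketch adds the standard RC4 key schedule, which the post does not show; the function names are mine.</p>

```python
def rc4_keystream(key: bytes):
    """Yield RC4 keystream bytes for the given key."""
    # Key schedule (standard RC4, not shown in the pseudocode above).
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    # Output generation, exactly as in the pseudocode.
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        yield S[(S[i] + S[j]) % 256]


def rc4(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(b ^ k for b, k in zip(data, rc4_keystream(key)))


ciphertext = rc4(b"Key", b"Plaintext")
recovered = rc4(b"Key", ciphertext)
```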
<p>This simplicity allows us to build a pencil-and-paper (p&p) cipher
based on its design.</p>
<h1 id="algorithm">Algorithm</h1>
<p>Our version will use a 26-symbol alphabet, the
common English one with letters <code class="highlighter-rouge">A-Z</code>. Each letter has internally
a numeric value from 0 to 25: <code class="highlighter-rouge">A = 0, B = 1, ... Z = 25</code>. This value
can be used to mask the plaintext by adding the cipher output mod 26;
also, addition modulo 26 can be arranged in a table, so that looking up
both summands by row and column quickly gives the result.</p>
<p>We need a paper strip on which we write two instances of the alphabet. We
also need a deck of 26 cards with a different letter drawn on each one,
or a set of 26 <a href="https://en.wikipedia.org/wiki/Scrabble">scrabble</a> pieces covering all the letters.</p>
<p>The key is obtained by shuffling the deck, so it is a random
permutation of the alphabet; for example:</p>
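<p>The masking step is plain letter addition modulo 26. A small sketch, with helper names of my own, assuming upper-case <code>A-Z</code> text and a keystream at least as long as the message:</p>

```python
A = ord("A")


def mask(plaintext: str, keystream: str) -> str:
    """Add keystream letters to plaintext letters modulo 26."""
    return "".join(chr((ord(p) - A + ord(k) - A) % 26 + A)
                   for p, k in zip(plaintext, keystream))


def unmask(ciphertext: str, keystream: str) -> str:
    """Subtract keystream letters modulo 26 to recover the plaintext."""
    return "".join(chr((ord(c) - ord(k)) % 26 + A)
                   for c, k in zip(ciphertext, keystream))


masked = mask("HELLO", "KKKKK")
restored = unmask(masked, "KKKKK")
```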
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> R Q G N Y P H I T M D B C E J K Z F A U X W O V L S
</code></pre></div></div>
<p>Now we put the key over the strip, fully aligned with one of the
instances of the alphabet (the first letter of the key beside the <code class="highlighter-rouge">A</code>
letter of the alphabet). The first letter of each is marked with
a pointer, say a bean or a coin:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> ·
R Q G N Y P H I T M D B C E J K Z F A U X W O V L S
... Z A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
·
</code></pre></div></div>
<h2 id="procedure">Procedure</h2>
<p>Now we move the upper pointer one position to the right
(in case of overflow, go back to the beginning) and raise
the pointed letter (<code class="highlighter-rouge">Q</code>); then we move the lower pointer
to the position of this letter on the lower alphabet,
and raise the <strong>upper letter</strong> found there (<code class="highlighter-rouge">Z</code>). This can be summarized in
the following figure:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> Q Z
R · G N Y P H I T M D B C E J K F A U X W O V L S
... Z A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
·
</code></pre></div></div>
<p>Both letters <code class="highlighter-rouge">[Q,Z]</code> are interchanged and their values added modulo
26: $(16 + 25) \bmod 26 = 15$ (this addition can be done
with a table, as mentioned), so <code class="highlighter-rouge">Q + Z = P</code>. We look for <code class="highlighter-rouge">P</code> in the
lower alphabet and select the upper letter as output:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> · *
R Z G N Y P H I T M D B C E J K Q F A U X W O V L S
... Z A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
^ ·
</code></pre></div></div>
<p>So, we get <code class="highlighter-rouge">K</code> as the output and we use it to mask the first plaintext letter.</p>
<p>The last step is to shift the paper
strip (lower alphabet) so that an <code class="highlighter-rouge">A</code> is placed beside
the lower pointer:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> ·
R Z G N Y P H I T M D B C E J K Q F A U X W O V L S
... J K L M N O P Q R S T U V W X Y Z A B C D E F G H I J K ...
·
</code></pre></div></div>
<p>Then we are ready for the next iteration.</p>
<h2 id="pseudocode">Pseudocode</h2>
<p>If we translate this procedure into computer language, it shows a tiny deviation
from the standard <a href="https://en.wikipedia.org/wiki/RC4">RC4</a> algorithm shown earlier in this post. This is the new
algorithm:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">i</span> <span class="o">:=</span> <span class="mi">0</span>
<span class="n">j</span> <span class="o">:=</span> <span class="mi">0</span>
<span class="k">while</span> <span class="n">GeneratingOutput</span> <span class="p">{</span>
<span class="n">k</span> <span class="o">:=</span> <span class="n">j</span>
<span class="n">i</span> <span class="o">:=</span> <span class="p">(</span><span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="n">mod</span> <span class="mi">26</span>
<span class="n">j</span> <span class="o">:=</span> <span class="p">(</span><span class="n">j</span> <span class="o">+</span> <span class="n">S</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="n">mod</span> <span class="mi">26</span>
<span class="n">swap</span> <span class="n">values</span> <span class="n">of</span> <span class="n">S</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="n">and</span> <span class="n">S</span><span class="p">[</span><span class="n">j</span><span class="p">]</span>
<span class="n">T</span> <span class="o">:=</span> <span class="n">S</span><span class="p">[(</span><span class="n">S</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">S</span><span class="p">[</span><span class="n">j</span><span class="p">]</span> <span class="o">+</span> <span class="n">k</span><span class="p">)</span> <span class="n">mod</span> <span class="mi">26</span><span class="p">]</span>
<span class="n">output</span> <span class="n">T</span>
<span class="p">}</span>
</code></pre></div></div>
<p>I think this modification adds nothing (good or bad) to the security of
the algorithm but makes it friendly for p&p implementation.</p>
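<p>The pseudocode can be turned into a runnable Python generator almost line by line. This is a sketch rather than a reference implementation; <code class="highlighter-rouge">keystream</code> and <code class="highlighter-rouge">EXAMPLE_KEY</code> are names chosen here, and the key is the example permutation given earlier.</p>

```python
import string
from itertools import islice

ALPHABET = string.ascii_uppercase


def keystream(key):
    """Yield output letters of the 26-symbol variant for a key permutation."""
    S = [ALPHABET.index(c) for c in key if c in ALPHABET]
    assert sorted(S) == list(range(26)), "key must be a permutation of A-Z"
    i = j = 0
    while True:
        k = j                      # where the strip's A sits before this step
        i = (i + 1) % 26           # upper pointer moves one place to the right
        j = (j + S[i]) % 26        # lower pointer moves to the raised letter
        S[i], S[j] = S[j], S[i]    # interchange the two raised letters
        yield ALPHABET[S[(S[i] + S[j] + k) % 26]]


EXAMPLE_KEY = "RQGNYPHITMDBCEJKZFAUXWOVLS"
print("".join(islice(keystream(EXAMPLE_KEY), 1)))  # first output letter: K
```

<p>The extra term <code class="highlighter-rouge">k</code> is the only difference from plain RC4: it accounts for the shift of the paper strip made at the end of the previous step.</p>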
<h2 id="key-schedule">Key schedule</h2>
<p>In this p&p version there is no need for a key schedule. The alphabet permutation
can be transmitted as the key. Nevertheless, it is possible to derive a
permutation from a numeric key. The number of permutations
is $26!$ and, if you want to set the initial positions of the counters as part of the
key, then $26! \times 26^2$ is the number of all possible combinations. So
$\log_2 \left( 26! \times 26^2 \right) \simeq 98\,\mathrm{bits}$; a random
number of 98 bits can be used to calculate an initial state setting.</p>
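<p>One possible way to turn such a number into an initial state is sketched below. The mapping is an assumption made for illustration, not something fixed by the cipher: the two counters are read as base-26 digits and the remainder as a Lehmer code of the permutation. The names <code class="highlighter-rouge">STATES</code> and <code class="highlighter-rouge">state_from_number</code> are hypothetical.</p>

```python
import math
import string

ALPHABET = string.ascii_uppercase
STATES = math.factorial(26) * 26**2   # permutations times the two counters


def state_from_number(n):
    """Map an integer 0 <= n < STATES to (permutation, i, j).

    Hypothetical mapping, not from the original post: the two counters are
    taken as base-26 digits and the rest is read as a Lehmer code.
    """
    if not 0 <= n < STATES:
        raise ValueError("number out of range")
    n, i = divmod(n, 26)
    n, j = divmod(n, 26)
    pool = list(ALPHABET)
    perm = []
    for size in range(26, 0, -1):
        n, pick = divmod(n, size)      # choose one of the remaining letters
        perm.append(pool.pop(pick))
    return "".join(perm), i, j


print(math.ceil(math.log2(STATES)))   # 98
print(state_from_number(0))           # ('ABCDEFGHIJKLMNOPQRSTUVWXYZ', 0, 0)
```

<p>A uniformly random 98-bit number may fall outside the valid range, so in practice one would redraw until it is below <code class="highlighter-rouge">STATES</code>.</p>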
<h1 id="a-small-challenge">A small challenge</h1>
<p>Now I propose a small challenge: with the key given in the example
I encrypted a text as:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ZDNSG FRZFF KJFTS QIGEY C
</code></pre></div></div>
<p>What is the original plaintext?</p>
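<p>For readers who prefer a keyboard to beans and paper strips, here is a small sketch of the masking step, assuming (as described above) that encryption adds the keystream letter modulo 26 and decryption subtracts it. The helper names <code class="highlighter-rouge">crypt</code>, <code class="highlighter-rouge">encrypt</code> and <code class="highlighter-rouge">decrypt</code> are chosen here; the solution is deliberately not printed.</p>

```python
import string

ALPHABET = string.ascii_uppercase


def keystream(key):
    """Yield keystream values (0-25) following the pseudocode above."""
    S = [ALPHABET.index(c) for c in key]
    i = j = 0
    while True:
        k = j
        i = (i + 1) % 26
        j = (j + S[i]) % 26
        S[i], S[j] = S[j], S[i]
        yield S[(S[i] + S[j] + k) % 26]


def crypt(key, text, sign):
    """Add (sign=+1) or subtract (sign=-1) the keystream modulo 26."""
    letters = [c for c in text.upper() if c in ALPHABET]  # drop spaces
    return "".join(ALPHABET[(ALPHABET.index(c) + sign * t) % 26]
                   for c, t in zip(letters, keystream(key)))


def encrypt(key, plaintext):
    return crypt(key, plaintext, +1)


def decrypt(key, ciphertext):
    return crypt(key, ciphertext, -1)


KEY = "RQGNYPHITMDBCEJKZFAUXWOVLS"   # the example key from this post
assert decrypt(KEY, encrypt(KEY, "HELLO WORLD")) == "HELLOWORLD"
```

<p>Calling <code class="highlighter-rouge">decrypt(KEY, "ZDNSG FRZFF KJFTS QIGEY C")</code> should then recover the text, provided the sign convention assumed here matches the one used to produce the challenge.</p>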
<h1 id="conclusion">Conclusion</h1>
<p>Thanks to the simple <a href="https://en.wikipedia.org/wiki/RC4">RC4</a> design, it is possible to build a p&p cipher on top of it
with a typical 26-letter alphabet. The key space is about 98 bits (not too
bad for a toy cipher) and a key schedule is not needed. But other weaknesses
of the original <a href="https://en.wikipedia.org/wiki/RC4">RC4</a> (say, the imperfect random oracle behavior) can be
amplified by the reduced alphabet.</p>
<hr />
<p><em>P.S. 2016-11-04</em> Johannes Bauer kindly provided a
<a href="../public/progs/rc4-26.py">nice Python program</a> to test the algorithm.</p>
Sun, 23 Oct 2016 00:00:00 +0200
http://crypto.anarres.info/2016/rc4_pencilandpaper
http://crypto.anarres.info/2016/rc4_pencilandpaperrc4p&pcrypto