-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy paththesis.tex
1094 lines (890 loc) · 79.9 KB
/
thesis.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
\documentclass[11pt,
oneside,openany, % for one-sided printing and screens
%draft % enable to compile faster and test layout
]{scrreprt}
% Settings
\usepackage[table]{xcolor}
\usepackage[ngerman,english]{babel}
\usepackage[utf8]{inputenc}
\usepackage{geometry}
\usepackage{microtype}
\usepackage[all]{nowidow}
\usepackage{csquotes}
\usepackage{dsfont}
\usepackage[bsc, % for Bachelor's Thesis titlepage
%project, % for Master's Project titlepage
claim,iaik]{iaikthesis}
\addtokomafont{disposition}{\rmfamily}
% <INSERT YOUR CUSTOM MACROS AND PACKAGES HERE>
\usepackage{booktabs}
\usepackage{caption}
\usepackage{upgreek}
\usepackage[linesnumbered, ruled, lined, boxed]{algorithm2e}
\usepackage[skip=10pt plus1pt, indent=10pt]{parskip}
\usepackage{graphicx}
\usepackage{minted}
\RecustomVerbatimEnvironment{Verbatim}{BVerbatim}{}
\usemintedstyle{colorful}
\usepackage{tcolorbox}
\usepackage{multirow}
\newtcbox{\inlinecode}{on line, boxrule=0pt, boxsep=0pt, top=2pt, left=2pt, bottom=2pt, right=2pt, colback=gray!15, colframe=white, fontupper={\ttfamily \footnotesize}}
% source: https://tex.stackexchange.com/questions/2705/typesetting-column-vector
\newcommand{\gmidrule}{\arrayrulecolor{lightgray}\specialrule{\lightrulewidth}{0.4\aboverulesep}{0.6\belowrulesep}\arrayrulecolor{black}}
\newcount\colveccount
\newcommand*\colvec[1]{
\global\colveccount#1
\begin{pmatrix}
\colvecnext
}
\def\colvecnext#1{
#1
\global\advance\colveccount-1
\ifnum\colveccount>0
\\
\expandafter\colvecnext
\else
\end{pmatrix}
\fi
}
% algorithm comment
\SetKwComment{Comment}{/* }{ */}
% floor cmd
\newcommand{\floor}[1]{\lfloor {#1} \rfloor}
\newcommand{\ceil}[1]{\lceil {#1} \rceil}
% Useful packages for complex content:
\usepackage{amsmath,amsfonts,amssymb} % typesetting math
\usepackage{centernot}
%\usepackage{tikz} % typesetting diagrams, figures, ...
%\usepackage{siunitx} % typesetting SI-units and formatted numbers
%\usepackage{listings} % typesetting source code
%\usepackage{booktabs,multirow} % utils for complex/beautiful tables
%\usepackage{subcaption} % placing multiple subfigures in a figure
% Useful utils:
\usepackage{blindtext} % insert dummy text for this template
%\usepackage{todonotes} % add ToDo markers (\todo{...})
%\usepackage{xspace} % define macros that don't eat the following space (add \xspace at the end)
%\usepackage[section]{placeins} % prevent figures from floating into the wrong section (\FloatBarrier)
% Bibliography, referencing, and indexing
\usepackage{csquotes} % typesetting \enquote{text in quotes} correctly
\usepackage[backend=biber,
style=alphabetic,
maxbibnames=20]{biblatex} % to generate the bibliography
\addbibresource{thesis.bib} % name of the bib-file
\usepackage[hidelinks]{hyperref} % clickable links (but hide color frames around links)
%\usepackage{cleveref} % named references (\Cref{chap:introduction}, ...)
\usepackage[toc,acronym,style=long3col]{glossaries} % List of acronyms and symbols (optional)
\makenoidxglossaries
%
\newacronym{aes}{AES}{Advanced Encryption Standard}
\newglossaryentry{xor}{name={\ensuremath{\oplus}}, sort=xor,
description={exclusive-or (\textsc{Xor})}}
% <INSERT YOUR GLOSSARY ENTRIES HERE (or \input{} a dedicated file) >
\begin{document}
%--- INSERT INFORMATION FOR TITLEPAGE ------------------------------------------
% Your name + previous academic degrees:
\thesisauthor{Bernhard Berg}
% Title of your thesis:
\thesistitle{Classic McEliece in Rust}
% Date of completion:
\thesisdate{June 2022}
% Supervisor:
\supervisortitle{Supervisors} % or Supervisor
\supervisor{%
Lukas Prokop, Daniel Kales\\
\smallskip
Institute of Applied Information Processing and Communications\\
Graz University of Technology
}
% Name of your degree programme according to your curriculum
%\curriculum{CurriculumName}
%\curriculum{Information and Computer Engineering}
%\curriculum{Computer Science}
\curriculum{Software Engineering and Management}
%--- FRONT MATTER --------------------------------------------------------------
\printthesistitle
\chapter*{Abstract}
The asymmetric McEliece cryptosystem, although previously unpopular in the cryptographic community due to its large key size, has become an exciting candidate in post-quantum cryptography because the hardness assumption of the underlying cipher remains unaffected by quantum computer attacks. In 2016 the NIST (National Institute of Standards and Technology) initiated the open post-quantum cryptography competition where the key-encapsulation mechanism classic McEliece is a round 3 candidate. The goal of this bachelor thesis is to understand the McEliece cryptosystem and port the classic McEliece implementation from the C programming language to the Rust programming language. Rust is a language that was developed with the focus on memory safety, specifically enabling safe low-level memory manipulation while at the same time offering multi-paradigm language features such as generics, pattern matching, iterators, and many more. In the course of this thesis, a Rust implementation of classic Mceliece was published as a crate (library) on the Rust community crate registry. The Rust implementation yields faster results than the C reference implementation in speed benchmarks.
\paragraph{Keywords:}
Post-quantum cryptography, Rust programming language, code-based cryptography, Classic McEliece, binary goppa codes
%\cleardoublepage
\tableofcontents % optional
%--- MAIN CONTENT --------------------------------------------------------------
% begin equation starts equation block which gets incremental number annotated as text
% begin equation* <- start equation block without number annotation
\chapter{Introduction}
\label{chap:introduction}
In the current digitalized world, the field of cryptography represents a technical option to maintain a secure information exchange. Its purpose is to draft and verify protocols that ensure data integrity (no third-party modifications), confidentiality (no third-party inspection), and authenticity (the receiver of a message can verify the sender). The real-world applications range from payment systems in digital commerce to secure messenger applications for smartphones, where low-level cryptographic primitives enable a secure communication channel to avoid information leakage to potential adversaries.
In order to enable the security property of many cryptographic protocols, one-way functions are incorporated in these schemes which are comparatively feasible to compute in a forward direction independent on the input, however incredibly difficult to inverse a certain output to the desired input again~\cite{Kryptografie2016}.
\begin{gather*}
\text{A generic function} \ f() \ \text{is defined as a one-way function if the following conditions hold:}\\
a = f(b) \ \ \text{computationally feasible to calculate a from b}\\
b = f^{-1}(a) \ \ \text{computationally infeasible to compute b from a}\\
\end{gather*}
In this context, the term computationally feasible is defined as that the function is computable in deterministic polynomial runtime where the polynomial degree determines the growth of the described algorithm. Such functions are considered practically solvable in the complexity class $P$ (polynomial).
A trapdoor function is a particular case of a one-way function where the same property holds that computation in the forward direction is feasible. However, calculating the inverse of a trapdoor function is only possible with the knowledge of the secret "trapdoor" information. Contrary to one-way functions, calculating the inverse of a trapdoor function recovers the complete input. For instance, additionally, the encryption randomness can be obtained~\cite{trapdoor}.
Up to date, two of the most essential one-way functions used in cryptographic schemes are the integer factorization problem (used in the RSA public-key cryptosystem) and the discrete logarithm problem, which is applied in public-key cryptographic systems using elliptic curves and Diffie–Hellman key exchange~\cite{Kryptografie2016}.
It is essential not to assume that one-way functions remain indefinitely secure. Their security is only valid as long as no method efficiently solves the backward direction. In 1997 Peter W. Shor published efficient randomized algorithms which solve the integer factorization and the discrete logarithm problems, given a theoretically powerful quantum computer on which these algorithms are exectued~\cite{pshor1997}. The implications of these findings triggered the scientific community to think about the future of cryptography once quantum computers are built, capable of fast and reliable integer factorization.
\section{Post-quantum cryptography}
The objective of post-quantum cryptography is to design protocols that are, to the best of our knowledge, secure against targeted quantum-computer attacks, in addition to being secure against classical adversarial approaches. In order to concentrate efforts on reaching this goal, the NIST (The National Insitute of Standards and Technology of the United States) organized an open competition for quantum-resistant public-key encryption, key-establishment, and digital signature algorithms in December 2016~\cite{nistPQCcompetitionAnnounce}.
The current scientific advancements in post-quantum cryptography can be categorized in the following research areas: Lattice-based cryptography, Hash-based cryptography, Multivariate cryptography, Isogeny-based cryptography, and Code-based cryptography.
In 1978 a public-key cryptosystem based on random binary irreducible error-correcting goppa codes was published by Robert J. McEliece~\cite{robertmceliece}. The one-way functionality of the McEliece cryptosystem relies on the decoding problem of binary goppa codes. Over the years, multiple cryptanalytic attacks have been published on the McEliece cryptosystem. However, with adjusted system parameters, the scheme remains unbroken as of today and has been shown to resist quantum attacks~\cite{engelbertOverbeckSchmidt, bernsteinLange, DinhMooreRussell}. Thus the security confidence in this cryptographic protocol is high, and since it does not rely on a problem currently known to be broken by quantum computers, the coding-based McEliece cryptosystem is a suitable candidate for post-quantum cryptography. The NIST competition lists "Classic McEliece" as a current round 3 finalist for public-key encryption and key-establishment algorithms~\cite{nistRound2summary}. Classic McEliece is a key encapsulation mechanism (KEM) built on a variant of the McEliece public-key encryption scheme using binary goppa codes.
The authors of the round 3 submission have provided a public package containing an implementation of Classic McEliece in the programming language C with detailed supporting information~\cite{classicMcElieceSubmission3}. While sophisticated research is being done on upcoming post-quantum cryptographic frameworks, another critical problem represents security vulnerabilities in cryptographic libraries. The security of internet network communication relies on cryptographic implementations, and concealed exploits in such code pose a considerable threat. The comprehensive review by Blessing et al~\cite{youShouldNtRollOwnCrypto} shows that $37.2\%$ of vulnerabilities originate from memory-safety issues in contrast to $27.2\%$ being cryptographic errors. The programming language Rust attempts to provide a solution to various memory safety bugs (for instance, buffer-overflows, use-after-free) by introducing strict semantic compiler checks for such errors. Rust has been created at Mozilla Research to maintain fast runtime performance while at the same time offering safe memory and type manipulations. The core of safe memory handling in Rust is its concept of ownership. Each memory resource where read/write manipulations are desired must have a reference variable flagged as its "owner". Only with this special entitlement the memory operations are allowed to proceed. With this concept, Rust aims to prevent dangling references which have not been invalidated, such as freed pointers in C/C++, which could still be accessed~\cite{RustBelt, memsafetyHui}.
\section{Contributions}
A pure Rust implementation has been carried out in the course of this thesis, which provides the Classic McEliece KEM to the Rust programming language.
In the background section, basic concepts of linear algebra, coding theory, KEM, and the McEliece cryptosystem are explained. Furthermore, a chapter with more details on Rust's memory safety concepts is provided. In the implementation \& evaluation section, the experienced challenges of porting C language code to Rust, benchmark comparison, and differences in the implementations will be explained.
The source code is released on Github:
\url{https://github.com/Colfenor/classic-mceliece-rust}
and as a crate on Rust crates.io registry:
\url{https://crates.io/crates/classic-mceliece-rust}
% You may want to keep your chapters in separate files ('chapterfilename.tex')
% and include them here using either
%\input{chapterfilename.tex} or \include{chapterfilename}
\chapter{Background}
\label{chap:background}
\newcommand\spacebetweenparandheader{\vspace{20pt}}
A quick introduction to linear codes and a few basic definitions of linear algebra and coding theory are given first as fundamentals to understand the McEliece cryptosystem. Furthermore, it is assumed that the reader is acquainted with finite fields $\mathbb{F}_{q}$, the matrix multiplication operation, calculating the inverse of a matrix, and linear equation solving techniques. The textbooks "Abstract algebra" by Robinson~\cite{AbstractAlgebraRobinson} and "Linear Algebra" by Liesen and Mehrmann ~\cite{liesenlineareAlgebra} provide further background for these topics.\spacebetweenparandheader
%\vspace{0.2cm}
\section{Linear Algebra}
\subsection{Vector space $(\mathbb{F}_{q})^{n}$}
For a vector space the following is defined, given an arbitrary field $\mathbb{F}$ and a set $S$ and the elements $a, b, c \in S$,
\begin{enumerate}
\item Vector addition:
\begin{gather*}
a + b = b + a \ \ \ \ commutative\\
a + (b + c) = (a + b) + c \ \ \ \ associative\\
\exists 0 \in S: a + 0 = a \ \ \ \ existence \ of \ neutral \ element\\
\exists (-a) \in S: a + (- a) = 0 \ \ \ \ existence \ of \ inverse \ element
\end{gather*}
The inverse to $a$ is $-a$ and the zero vector is the neutral element $0$.\vspace{0.5cm}
\item Multiplication with a scalar value:
If there exists an element $a \in S$ and a element $k \in \mathbb{F}$ then there exists the element $ka \in S$ with the following properties:
\begin{gather*}
k(h a) = (kh)a\\
1a = a\\
k(a + b) = ka + kb \ \ \ \ (distributive)\\
(k + h)a = ka + ha\\
\end{gather*}
for all $a, b \in S$ and $h, k \in \mathbb{F}$.
If these properties hold then $S$ is a vector space over the field $\mathbb{F}$ and the elements of the vector space $S$ are called vectors and the elements of $\mathbb{F}$ are called scalars~\cite{mathefuerInfoBand1}.
\end{enumerate}
\subsection{Linearity}
In the context of vector spaces, a mapping of two vector spaces X, Y, \\ $F:X \xrightarrow{} Y $ is linear if for all vectors $c, d \in X$ and all scalar values $v \in \mathbb{F}$ additivity and homogeneity holds:
\begin{gather*}
F(c + d) = F(c) + F(d)\\
F(v \ c) = v F(c)
\end{gather*}
\subsection{Dimension of a vector space}
The vector space dimension is defined by the maximum amount of possible linear independent vectors in a vector space, which is then written as $\operatorname{dim}(S)$. The dimension can be finite or infinite~\cite{mathefuerInfoBand1}.
\subsection{Linear subspace}
A subspace $P \subseteq S$ of the vector space $S$ is itself a vector space if $P$ is complete with vector addition and multiplication with a scalar, so if for all $a, b \in P$ and every $s \in \mathbb{F}$
\begin{gather*}
a, b \in P \rightarrow a + b \in P\\
a \in P \rightarrow s a \in P
\end{gather*}
holds.
A linear combination of vectors $a_1, a_2, ... a_m \in S$ is the summation of the individual vector multiplications with arbitrary scalars $k_1, ... , k_m \in \mathbb{F}$
\begin{equation*}
\sum_{i=1}^{m} k_i a_i = k_1 a_1 + k_2 a_2 + \dots + k_m a_m
\end{equation*}
Furthermore vectors are linearly independent, if and only if the scalar values $k_1 = k_2 = \dots = k_m = 0$ are the only option to equate the linear combination of these vectors with zero.
\begin{gather}
k_1 a_1 + k_2 a_2 + \dots + k_m a_m = 0
\end{gather}
If this condition does not hold, the vectors are called linearly dependent, and the zero vector is by definition linear dependent~\cite{mathefuerInfoBand1}.
\subsection{Permutation matrix}
A permutation matrix $\boldsymbol{P} \in \mathds{R}^{n \times n}$ is defined by having in every row and every column precisely one entry set to 1 and all other entries set to 0.
For specific calculations, it may be desired to swap the position of entire rows and/or columns of a matrix. In order to achieve this operation, matrix multiplication can be performed with a permutation matrix $\boldsymbol{P}$ and the input matrix $\boldsymbol{M}$. If $\boldsymbol{P}$ is multiplied from the left or right, the rows or respectively, the columns of $\boldsymbol{M}$ are interchanged~\cite{liesenlineareAlgebra}.
Consider the following examples:
\begin{equation*}
M_{3\times 3} \times P_{3\times 3} =
\begin{pmatrix}
\textcolor{blue}{1} & 2 & \textcolor{red}{3}\\
\textcolor{blue}{5} & 8 & \textcolor{red}{13}\\
\textcolor{blue}{21} & 34 & \textcolor{red}{55}
\end{pmatrix}
%
\times
%
\begin{pmatrix}
0 & 0 & 1\\
0 & 1 & 0\\
1 & 0 & 0
\end{pmatrix}
=
\begin{pmatrix}
\textcolor{red}{3} & 2 & \textcolor{blue}{1}\\
\textcolor{red}{13} & 8 & \textcolor{blue}{5}\\
\textcolor{red}{55} & 34 & \textcolor{blue}{21}
\end{pmatrix},
\end{equation*}
\begin{equation*}
P_{3\times 3} \times M_{3\times 3} =
\begin{pmatrix}
0 & 0 & 1\\
0 & 1 & 0\\
1 & 0 & 0
\end{pmatrix}
%
\times
%
\begin{pmatrix}
\textcolor{blue}{1} & \textcolor{blue}{2} & \textcolor{blue}{3}\\
5 & 8 & 13\\
\textcolor{red}{21} & \textcolor{red}{34} & \textcolor{red}{55}
\end{pmatrix}
=
\begin{pmatrix}
\textcolor{red}{21} & \textcolor{red}{34} & \textcolor{red}{55}\\
5 & 8 & 13\\
\textcolor{blue}{1} & \textcolor{blue}{2} & \textcolor{blue}{3}
\end{pmatrix}.
\end{equation*}
\section{Coding theory}
\subsection{Code word}
A code word, in general, is a unit of information used in a
communication protocol. A single codeword consists of a vector $x = (x_{1},...,x_{k})$ with $k$ single symbol entries of an alphabet~\cite{mathefuerInfoBand1}. The alphabet used in this document consists of 0 and 1 because of the ease of implementation in current electronic systems. A code word consisting of only 0 and 1 is called a binary codeword. A code word with $q \geq 2$ symbols is called $q$-ary code word~\cite{jaygrossman}.
\subsection{Linear code}
A linear code is composed of codewords that form a vector space of $\mathbb{F}^{n}_{2}$ where $n$ depicts the dimension of the vector space. Binary linear codes are built up using binary code words and possess a length property \emph{k}. It is distinguished between $n$ and $k$ to provide an error-correction property. $n - k$ control bits are attached to a code word to result in an element of $\mathbb{F}^{n}_{2}$. In a so-called parity-check, these control bits can be later evaluated in order to check for transmission errors~\cite{mathefuerInfoBand1}. Binary linear codes and $q$-ary linear codes are defined analogously to code words. For the rest of the document, a linear code always refers to a binary linear code.
\subsection{Hamming weight}
A \emph{Hamming weight} of an element c $\in \mathbb{F}^{n}_{2}$ is defined to be the amount of symbols different from zero in this element c~\cite{bernsteinLange}.
Given the example codeword $\{01011\}$ with a total of 5 digits, this codeword has a hamming weight of 3 because overall, there are three digits different from zero.
\subsection{Hamming distance of codewords}
Considering two Codewords $c1, c2$ with $k > 0$ their \emph{hamming distance} is defined as the number of positions where the symbols are different from one another~\cite{bernsteinLange}.
When comparing the two codewords $\{ 1 \ \bold{1} \ \bold{0} \ \bold{0}\}$ and $\{ 1 \ \bold{0} \ \bold{1} \ \bold{1}\}$ we get a hamming distance of 3 because when comparing the last three digits they are not the same.
In a code $c$ with $n$ length and with dimension $k > 0$ its \emph{minimum distance} is defined as the smallest Hamming distance of two arbitrary nonzero element of the code $c$~\cite{bernsteinLange}. In the first step, the hamming distance between individual codewords is calculated, and subsequently, the minimum distance between them is selected.
\subsection{Error-detection and error-correction in code words}
Considering a code-word $\boldsymbol{c}$ with a minimum distance $\boldsymbol{d}$ then at most $\boldsymbol{d} - 1$ errors in $\boldsymbol{c}$ can be \textit{detected} and at most $\floor{\frac{\boldsymbol{d} - 1}{2}}$ errors in $\boldsymbol{c}$ can be \textit{corrected}~\cite{AbstractAlgebraRobinson}.
In the example binary code: $c1 = \{10010\}$, $c2 = \{01100\}$ and $c3 = \{10101\}$, we have a \textit{minimum distance} of 3, meaning we can detect a maximum of 2 errors and correct a maximum of 1 error.
Assume that a noisy code-word $x = \{10000\}$ is recieved, the error can be corrected ($\boldsymbol{d}(c1, x) = 1$) and yield back code-word $c1$. If the noisy code-word $y = \{11000\}$ is transmitted and two errors occurred, the distances are $\boldsymbol{d}(c1, y) = 2$, $\boldsymbol{d}(c2, y) = 2$ and $\boldsymbol{d}(c3, y) = 3$,
It is possible to detect that two errors have occurred. However, only one error can be corrected, and thus it is not possible to retrieve the original codeword~\cite{AbstractAlgebraRobinson}.
\subsection{Generator matrix}
A generator matrix $G$ can be formed from a linear code having the shape $n \times k$ and can be viewed as the identity matrix $\mathbb{I}_{k}$ and matrix $A$ of shape $k \times (n - k)$ as $G = (\mathbb{I}_{k} \mathbin\Vert A)$. A generator matrix can be brought into a canonical reduced echelon form using the Gaussian elimination algorithm~\cite{mathefuerInfoBand1}. An example is depicted below:
\begin{equation*}
G = \begin{pmatrix}
1 & 0 & \dots & 0 & a_{1,1} & \dots & a_{1,n-k}\\
0 & 1 & \dots & 0 & a_{2,1} & \dots & a_{2,n-k}\\
\vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \dots & 1 & a_{k,1} & \dots & a_{k,n-k}\\
\end{pmatrix}
\end{equation*}
The rows of this generator matrix can be viewed as unique linear codes consisting of unit vectors and the attached control bits. A codeword $\bold{c}$ can be obtained by multiplication of the transposed generator matrix with individual message blocks $x$
\begin{equation*}
\bold{c} = G^T x
\end{equation*}
A parity-check matrix is a matrix constructed from the generator matrix such that
\begin{equation*}
H = (A^T \ | \ \mathbb{I}_{n - k}).
\end{equation*}
If the parity-check matrix $H$ is now multiplied by the transposed generator matrix $G$, the result is zero.
This property can now be used to evaluate if a given codeword is an element of the linear code C ($\in C$) or not ($\centernot\in C$) if a given codeword multiplied with the parity-check matrix yields zero. As a result, the codeword is valid, and no errors have occurred during the transmission~\cite{mathefuerInfoBand1}.
\begin{equation*}
\begin{aligned}
H G^T = (A^T \ | \ \mathbb{I}_{n - k}) \colvec{2}{\mathbb{I}_{k}}{A^T} = A^T \mathbb{I}_{k} + \mathbb{I}_{n - k} A^T = A^T + A^T = 0 \\
H \bold{c} = H G^T \bold{x} = 0
\end{aligned}
\end{equation*}
% \textcolor{red}{1}
Let us consider a word $(1,0)$ and generator matrix $=\left(\begin{matrix}1 & 0\\ 0 & 1\\ 1 & 1\end{matrix}\right)$:
\begin{equation*}
\boldsymbol{c} \ = \ 1 \colvec{3}{1}{0}{1}
+ 0 \colvec{3}{0}{1}{1} = \colvec{3}{1}{0}{1}
\end{equation*}
If $( 1 \ 0 \ 1)$ is sent as a message, the receiver can use the parity-check matrix to check if errors are present in that message. The check can be achieved by multiplying the message vector with the parity-check matrix, applying to the result modulo 2. If the result equals zero, then no errors occurred during transmission. However, if the result is not equal to zero, then errors have occurred~\cite{mathefuerInfoBand1}.
\begin{equation*}
\begin{aligned}
\begin{pmatrix}
1 & 0 & 1
\end{pmatrix} \colvec{3}{1}{1}{{1}} = 0 \ mod \ 2\\
\end{aligned}
\end{equation*}
\subsection{Syndrome decoding problem}
% https://en.wikipedia.org/wiki/Decoding_methods#Syndrome_decoding
The hardness of the McEliece cryptosystem relies on syndrome decoding, which in general enables decoding a linear code transmitted via a noisy communication channel. Individual vectors in the following description are assumed as column vectors for simplicity in order to avoid explicit transposed markings. A linear binary code $C \in \mathbb{F}_{2}^{n}$ has the minimum distance $d$ and is of length $n$. There exists a parity-check matrix $H \in \mathbb{F}_{2}^{(n-k) \times n}$ for the code $C$. According to section $2.2.7$, the error correction capacity of $C$ is $t = \floor{\frac{\boldsymbol{d} - 1}{2}}$, meaning a maximum of $t$-errors introduced by the communication channel can be corrected. If codeword $\boldsymbol{x} \in C$ is sent over the channel and an error-vector $e \in \mathbb{F}_{2}^{n}$ is produced and sent with the codeword, $y = \boldsymbol{x} + e$ is obtained as the recieved message. In this specific setting, the hamming weight of $e$ is $wt(e) \leq t$ in order to guarantee unique decoding of $\boldsymbol{x}$. The actual $\textit{syndrome}$ $S(\boldsymbol{x})$ is defined as:
\begin{gather*}
S(\boldsymbol{x}) = Hy = H(\boldsymbol{x} + e) = H\boldsymbol{x} + He = 0 + He\\
\end{gather*}
The term $H\boldsymbol{x}$ would evaluate to $0$ if a valid codeword $\boldsymbol{x}$ was provided, and thus the syndrome is only dependent on the error vector $e$. Once $e$ has been recovered, the original codeword $x$ can be extracted via subtraction of the error-vector $e$ from $y$~\cite{AbstractAlgebraRobinson, understandingGoppaDecoding}. In order to recover $e$ from $S(x)$, different decoding algorithms are published in the literature. Concerning the asymptotic computational complexity, for instance, the ball collision algorithm published by Bernstein et al.~\cite{ballCollision} yields $2^{0.05559n}$. An improved algorithm by Becker et al. exhibits complexity $2^{0.0494n}$~\cite{beckerJouxMay}.
\subsection{Binary Goppa codes}
Binary goppa codes are a subclass of goppa codes and are the crucial building block in the McEliece cryptosystem. To the current knowledge, they exhibit the property of being efficiently decodable with the insight of an error-correcting algorithm $\mathcal{D}$. This results in the desired trapdoor functionality by selecting private $\mathcal{D}$ matching a generator matrix constructed by a chosen binary goppa code~\cite{engelbertOverbeckSchmidt, goppaVD}. A binary goppa code is defined by the \textit{goppa polynomial} $g(X)$ over $\mathbb{F}_{2^{m}}$:
\begin{equation*}
g(X) = \sum_{i=0}^{t} g_{i}X^{i} \in \mathbb{F}_{2^{m}}
\end{equation*}
and a list of $n$ distinct elements $L$:
\begin{equation*}
L = (a_{0},...,a_{n-1}) \in \mathbb{F}_{2^{m}}^{n}
\end{equation*}
where the condition $g(a_{i}) \neq 0$ for all $i$ elements holds. The binary goppa code is denoted as $\mathcal{G}(L,g(X))$ with the set of vectors $c = (c_{0},...,c_{n-1}) \in \mathbb{F}_{2}^{n}$ (meaning the entries are either zeros or ones) such that the identity property holds~\cite{engelbertOverbeckSchmidt}. The \textit{syndrome} of $c$ is defined as:
\begin{gather*}
S(c) = \sum_{i=0}^{n-1} \frac{c_{i}}{X - a_{i}} \ \text{mod} \ g(X).
\end{gather*}
Binary goppa codes discussed in this thesis are irreducible over $\mathbb{F}_{2^{m}}$. A binary goppa code of degree $t$ and length $n = 2^{m}$ has a $k \geq n - mt$ dimensionality and is able to correct $t$ or less errors~\cite{robertmceliece}. A polynomial $p(x)$ is \textit{irreducible} with degree larger than 1, denotes that there is no other polynomial $q(x)$ with degree larger than 0 and smaller than degree of $p(x)$ which divides $p(x)$~\cite{mathefuerInfoBand1}.
A parity-check matrix $H$ can be constructed, which is subsequently used to form a generator matrix. Note that matrix A is in the lower triangular form filled with individual goppa polynomials $g_{t}$, the matrix B is in a transposed Vandermonde matrix form~\cite{liesenlineareAlgebra} containing distinct elements $a_{i}$ and matrix C is a diagonal matrix of goppa polynomials using distinct elements as input.
\begin{equation*}
H = ABC = \begin{pmatrix}
g_{t} & 0 & 0 &\dots & 0\\
g_{t-1} & g_{t} & 0 & \dots & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
g_{1} & g_{2} & g_{3} & \dots & g_{t}\\
\end{pmatrix}
%
\times
%
\begin{pmatrix}
1 & 1 & \dots & 1\\
a_{0} & a_{1} & \dots & a_{n-1}\\
\vdots & \vdots & \ddots & \vdots\\
a_{0}^{t-1} & a_{1}^{t-1} & \dots & a_{n-1}^{t-1}\\
\end{pmatrix}
%
\times
%
\begin{pmatrix}
\frac{1}{g(a_{0})} & \cdots & \cdots & 0\\
\vdots & \frac{1}{g(a_{1})} & \reflectbox{$\ddots$} & \vdots\\
\vdots & \reflectbox{$\ddots$} & \ddots & \vdots\\
0 & \cdots & \cdots & \frac{1}{g(a_{n-1})}\\
\end{pmatrix}
\end{equation*}
\begin{equation*}
= \begin{pmatrix}
g_{t} g(a_{0})^{-1} & \dots & g_{t} g(a_{n-1})^{-1}\\
(g_{t-1} + g_{t} a_{0})g(a_{0})^{-1} & \dots & (g_{t-1} + g_{t} a_{n-1})g(a_{n-1})^{-1}\\
\vdots & \ddots & \vdots\\
(\sum_{j=1}^{t} g_{j} a_{0}^{j-1})g(a_{0})^{-1} & \dots & (\sum_{j=1}^{t} g_{j} a_{n-1}^{j-1})g(a_{n-1})^{-1}\\
\end{pmatrix}
\end{equation*}
\subsubsection{Efficient decoding of irreducible binary Goppa codes}
In order to efficiently decode a goppa code, various algorithms have been published up to date. In the context of this thesis, the description of decoding binary goppa codes shall focus on the decryption routine used in the classic McEliece cryptosystem. In order to read about other goppa decoding algorithms, the recently published resource "Understanding binary-Goppa decoding" by Daniel J. Bernstein is recommended~\cite{understandingGoppaDecoding}. The minimum distance of a goppa code generated by an irreducible polynomial of degree $t$ is at least $2t + 1$, and the maximum amount of correctable errors is $t$~\cite{engelbertOverbeckSchmidt}.
The high-level description of the classic McEliece decryption implementation follows:
A triplet consisting of the calculated syndrome, an authentification tag, and the output of the stream cipher is sent to a receiving party. In order to read a message, the receiver initially has to decode the syndrome via the Niederreiter secret key and obtains an error vector. The error vector gets hashed in the next step, and two symmetric keys are obtained. First, the receiving party uses the authentification tag to verify and decrypt the stream-cipher output. In the authentification as well as the decoding procedure aborts if a tampered message is recieved~\cite{McBits}.
The decoding routine consists of 5 consecutive stages: starting permutation, computing the syndrome, key-equation solving, finding the root and concluding permutation. For the key-equation solving step the Berlekamp-Massey algorithm was implemented which shall now be explained in a little detail. At the start of the algorithm polynomials $\sigma(x) = 1, \beta(x) = x \in \mathbb{F}_{2^{m}}, \delta = 1 \in \mathbb{F}_{2^{m}}$ and $\ell = 0 \in \mathbb{Z}$ are initialized. The term $S(x) = \sum_{i=0}^{2t-1} S_{i}x^{i}$ is defined at the input syndrome.
Next, in iterations, $k$ starting from $0$ up to $2t - 1$, the input variables are recalculated as is shown in Figure 2.1. The remaining substeps are described in more detail in the paper "McBits Revisited" by Tung Chou~\cite{McBits}.
\begin{figure}[H]
\begin{gather*}
d \leftarrow \sum_{i=0}^{t} \sigma_{i} S_{k-i} \\
{[\sigma(x), \beta(x), \ell, \delta] \leftarrow\left\{\begin{array}{l}
{[\delta \sigma(x)-d \beta(x), \ x \beta(x), \ \ell, \ \delta], \ d=0 \text { or } k<2 \ell} \\
{[\delta \sigma(x)-d \beta(x), \ x \sigma(x), \ k-\ell+1, \ d], \text { otherwise. }}
\end{array}\right.}
\end{gather*}
\caption{Inversion-free Berlekamp-Massey algorithm for key-equation solving~\cite{McBits}.}
\end{figure}
\section{Key encapsulation mechanism (KEM)}
As a means to achieve encryption using public-key cryptography of messages with arbitrary length, a key encapsulation mechanism (KEM) is used in order to securely transmit a symmetric key between two parties~\cite{Coretti2013}. Once the transfer operation has been completed, the symmetric key is further used to encrypt the message by the receiving party.
A KEM consists as a triplet of the algorithms $\mathcal{K} = (KeyGen, Encapsulation, Decapsulation)$ which are described below.
\begin{equation*}
1. \ \ \text{KeyGen()} \rightarrow (pk, sk)\\
\end{equation*}
The KeyGen algorithm produces a pair of public-key and secret-key as output and incorporates randomness in the process (probabilistic).
\begin{equation*}
2. \ \ \text{Encapsulation($pk$)} \rightarrow (c, k)\\
\end{equation*}
In the probabilistic encapsulation algorithm, the previously generated public-key is taken as input and the output is a ciphertext $c$, aswell as key $k \in K$.
\begin{equation*}
3. \ \ \text{Decapsulation($sk, c$)} \rightarrow (k)\\
\end{equation*}
Finally, the decapsulation algorithm is itself deterministic and takes the secret key and ciphertext as input and returns a key $k \in K$ or an error symbol $\bot$~\cite{saitoYamakawa, BindelBrendlFischlin}.
\section{The McEliece cryptosystem}
\label{sec:mceliece_crypto}
\subsection{Key generation}
% listings in pseudocode schreiben siehe, specification
% beschreibung der subroutinen
Concerning the key-generation, the initial parameters $n, t \in \mathbb{N}$ are chosen, where $t < n$. In the next step, a random error-correcting binary goppa code $\mathcal{G}$ is selected, which can correct up to $t$ errors.
The private key consists of an $n \times k$ generator matrix $G$, constructed from a chosen random error-correcting binary goppa code, a random binary non-singular matrix $S$, a random $n \times n$ permutation matrix $P$ and an efficient binary goppa code decoding algorithm $D$, resulting in the set $(G, S, P, D)$. The public key $G^{\prime}$ is subsequently created by matrix multiplication $G^{\prime} = SGP$ resulting in the set $(G, t)$. The security assumption relies on the NP-hardness (nondeterministic polynomial time) of decoding random linear codes and the random shuffle of the generator matrix by matrix multiplication with $S$ and $P$~\cite{robertmceliece, NP_berlekamp, bernsteinLange, engelbertOverbeckSchmidt}.
\subsection{Message encryption}
A plaintext message $m \in \{0,1\}^k$ and an error vector $e \in \{0,1\}^n$ of hamming weight $t$ are selected. The ciphertext $\bold{c}$ is calculated by multiplying the message vector with the public-key followed by addition of the error vector.
\begin{equation*}
\bold{c} = m \underbrace{SGP}_{G^{\prime}} + e\\
\end{equation*}
\subsection{Message decryption}
When decrypting a recieved ciphertext $\bold{c}$ to back to the plaintext message $m$, in a first step the inverse permutation operation is applied on $\bold{c}$. Multiplying the parity-check matrix to the result yields the syndrome $He^{\prime}$ which can be decoded to $e^{\prime}$ and thus remove the error vector to recover at $c^{\prime}$.
\begin{gather*}
\bold{c} P^{-1} = \underbrace{mS}_{m^{\prime}}G + e P^{-1}\\
= \underbrace{m^{\prime} G}_{c^{\prime}} + \underbrace{e P^{-1}}_{e^{\prime}}\\
= c^{\prime} + e^{\prime}\\
H(cP^{-1}) = H(c^{\prime} + e^{\prime}) = \underbrace{Hc^{\prime}}_{= 0} + He^{\prime}\\
\end{gather*}
Next since $e P^{-1}$ has hamming weight $t$ and $mSG$ is a codeword, the private decoding algorithm $D$ can be applied to obtain $mS$. Now a simple multiplication by the inverse of matrix $S$ yields the plaintext message $m$.
\begin{gather*}
D(mSG) = mS \\
m = mS S^{-1} \\
\end{gather*}
\subsection{Niederreiter variation}
In 1986,~\cite{niederreiter} H. Niederreiter published a knapsack variation of the McEliece cryptosystem. As a prerequisite the system parameters $n, t \in \mathbb{N}$ are chosen and the public key $H^{\prime}$ is the product of matrix multiplication:
\begin{align*}
\text{parity-check matrix } & H \in \mathbb{N}^{(n-k) \times n} \\
\text{binary random non-singular } & M \in \mathbb{N}^{(n-k) \times (n-k)}\\
\text{random permutation matrix } & P \in \mathbb{N}^{n \times n}\\
\text{public key } & H^{\prime} = MHP \in \mathbb{N}^{(n-k) \times n}\\
\end{align*}
An efficient syndrome decoding algorithm $D$ has to be chosen for the private key and is then kept as the private the set $(P, D, M)$. A plaintext message $m \in \{0,1\}^n$ with hamming weight $t$ is encrypted by calculating the syndrome:
\begin{equation*}
s = H^{\prime}m^{T}
\end{equation*}
and as a means to decrypt the syndrome first the inverse matrix $M$ is multiplied with $s$. In the next step, the syndrome decoding algorithm is employed in order to yield $Pm^{T}$ and subsequently by calculating the inverse permutation matrix the plaintext message $m$ is finally obtained~\cite{engelbertOverbeckSchmidt}.
\begin{gather*}
M^{-1}s = HPm^{T}\\
D(M^{-1}s) = Pm^{T}\\
m^{T} = P^{-1}Pm^{T}\\
\end{gather*}
In the NIST Classic McEliece reference implementation, the Niederreiter encryption and decryption are implemented~\cite{McBits}.
\subsection{Improved efficiency of Classic McEliece KEM submission}
The authors of the round 3 Classic McEliece submission state that the efficiency of creating the public key generator matrix $G$ for a linear code $C$ was enhanced by choosing it to be in a unique systematic form. As a consequence the generator matrix takes the form $\left( \frac{T}{I_{k}} \right)$ where $T$ has dimensions $(n - k) \times k$ and $I_{k}$ denotes the identity matrix with dimensions $k \times k$. The corresponding parity-check matrix $H$ is of form $(I_{n-k} \ | \ T)$. About 29\% of choices of $C$ exhibit this systematic form where 3.4 key-generation attempts are needed on average. The advantage of choosing such a generator matrix form is that the public key size in bits could be reduced to $k(n-k)$ instead of $(kn)$. A further improvement of key-generation could be achieved by using a semi-systematic matrix form~\cite{classicMcElieceSubmission3}.
\subsubsection{Semi-systematic form matrix}
Assume two integer parameters $\mu, \nu$ where $\nu \geq \mu \geq 0$. Matrix $M$ is of rank $r$ and in reduced row-echolon form. It is assumed that $r \geq \mu$ and at least $r-\mu+\nu$ columns are present.
$M$ is then in $(\mu, \nu)$ semi-systematic form, if $r$ rows exist in $M$ and columns $c_{i} = i$ for $1 \leq i \leq r-\mu$ and $c_{i} \leq i-\mu+\nu$ for $1 \leq i \leq r$.
Columns are set $c_{1} = 1, c_{2} = 2$ up until $c_{n-k-\mu} = n-k-\mu$ and $n-k-\mu < c_{n-k-\mu+1} < c_{n-k-\mu+2} < ... < c_{n-k} \leq n-k-\mu-\nu$~\cite{classicMcElieceSubmission3}.
The authors estimated that for semi-systematic form choosing the parameters $(\mu, \nu) = (32, 64)$ the probability of failure for each key-generation attempt could be narrowed below $2^{-30}$, needing only one key-generation attempt in most cases~\cite{classicMcElieceSubmission3}.
\pagebreak
\subsection{Classic McEliece KEM}
The depicted algorithms and subroutines were taken from the classic McEliece round three submission~\cite{classicMcElieceSubmission3}.
\subsubsection{Notation}
$\ell$ - the output-length of a cryptographic hash function\\
$G$ - pseudorandom bit generator function\\
$H$ - cryptographic hash function\\
$\Upgamma^{\prime}$ - $(g, \alpha^{\prime}_{1}, \alpha^{\prime}_{2},..., \alpha^{\prime}_{i})$\\
$T$ - a $(n-k) \times k$ matrix in $\mathbb{F}_{2}$ where a parity-check matrix $H = (I_{n-k} \ | \ T)$ exists\\
$s$ - bit string of the code length $n$\\
$g$ - monic irreducible polynomial of finite field $\mathbb{F}_{q}[x]$\\
$\alpha_{i}$ - element of finite field $\mathbb{F}_{q}$\\
$\Upgamma$ - $(g, \alpha_{1}, \alpha_{2},..., \alpha_{i})$ goppa code with dimension $k = n - mt$ and length property $n$\\
$\mu$ - non-negative integer\\
$\nu$ - non-negative integer\\
$n$ - code length\\
$k$ - code dimension\\
$t$ - error-correction ability\\
$q$ - field size\\
$m$ - $\log_{2}q$
In the FixedWeight algorithm a precomputed integer $\tau \leq t$ is defined \\
as $t$ if $n = q$;\\
as $2t$ if $q/2 \leq n < q$;\\
as $4t$ if $q/4 \leq n < q/2$;\\
etc.
In the decode algorithm the syndrome decoding problem is applied. The bit string $C_{0} \in \mathbb{F}^{n-k}_{2}$ is decoded to error-vector $e$ with hamming weight $t$ if $C_{0} = He$.
%genpoly_gen() corresponds to the FieldOrdering & Irreducible algorithm%
\begin{algorithm}[H]
\caption{MatGen (systematic form)}\label{alg:MatGen}
\SetKwInOut{Input}{input}\SetKwInOut{Output}{output}
\Input{$\Upgamma$ - $(g, \alpha_{1}, \alpha_{2},..., \alpha_{n})$, integer parameters $(\mu, \nu) = (0,0)$}
\Output{Either $\bot$ or ($T$,...) and $T$ represents the public key: a $(n-k)\times k$ matrix over $\mathbb{F}_{2}$}
Initialize $t \times n$ matrix $\Tilde{H} = \{h_{i,j}\} \in \mathbb{F}_{2}$\;
\For{$i\leftarrow 1$ \KwTo $t$}{
\For{$j\leftarrow 1$ \KwTo $n$}{
$h_{i,j} \gets \alpha^{i-1}_{j}/g(\alpha_{j})$\;
}
}
Each entry $u_{0} + u_{1}z + ...+ u_{m-1}z^{m-1}$ in $\Tilde{H}$ is replaced with column of $m$ bits $u_{0},u_{1},...,u_{m-1}$ to form $mt \times n$ matrix $\Hat{H} \in \mathbb{F}_{2}$\;
$\Hat{H}$ gets reduced to systematic form $(I_{n-k} \ | \ T)$ where $I_{n-k}$ represents a $(n-k) \times (n-k)$ identity matrix. In case the reduction fails return $\bot$\;
Return $(T, \Upgamma)$\;
\end{algorithm}
\begin{algorithm}[H]
\caption{MatGen (semi-systematic form)}\label{alg:MatGen-semi}
\SetKwInOut{Input}{input}\SetKwInOut{Output}{output}
\Input{$\Upgamma$ - $(g, \alpha_{1}, \alpha_{2},..., \alpha_{n})$, general $(\mu, \nu)$}
\Output{Either $\bot$ or ($(T, c_{n-k-\mu+1},...,c_{n-k}, \Upgamma^{\prime})$) $T$ represents the public key: a $(n-k)\times k$ matrix over $\mathbb{F}_{2}$} and $\Upgamma^{\prime}$ represents a goppa code
Initialize $t \times n$ matrix $\Tilde{H} = \{h_{i,j}\} \in \mathbb{F}_{2}$\;
\For{$i\leftarrow 1$ \KwTo $t$}{
\For{$j\leftarrow 1$ \KwTo $n$}{
$h_{i,j} \gets \alpha^{i-1}_{j}/g(\alpha_{j})$\;
}
}
Each entry $u_{0} + u_{1}z + ...+ u_{m-1}z^{m-1}$ in $\Tilde{H}$ is replaced with column of $m$ bits $u_{0},u_{1},...,u_{m-1}$ to form $mt \times n$ matrix $\Hat{H} \in \mathbb{F}_{2}$\;
$\Hat{H}$ gets reduced to $(\mu, \nu)$ semi-systematic form $H$ where $I_{n-k}$ represents a $(n-k) \times (n-k)$ identity matrix. In case the reduction fails return $\bot$\;
$(\alpha^{\prime}_{1}, \alpha^{\prime}_{2},..., \alpha^{\prime}_{i}) \gets (\alpha_{1}, \alpha_{2},..., \alpha_{i})$\;
\For{$i \leftarrow n-k-\mu+1$ \KwTo $i = n-k$}{
\If{$c_{i}$ == $i$}{
continue\;
}
swap the $i$-th column with the $c_{i}$ column in $H$\;
}
$H$ is now in systematic form $(I_{n-k} \ | \ T)$ where $I_{n-k}$ represents a $(n-k) \times (n-k)$ identity matrix\;
Return $(T, c_{n-k-\mu+1},...,c_{n-k}, \Upgamma^{\prime})$\;
\end{algorithm}
\begin{algorithm}[H]
\caption{Irreducible-polynomial generation}\label{alg:irreducible}
\SetKwInOut{Input}{input}\SetKwInOut{Output}{output}
\Input{a string of $\sigma_{1}t$ bits $d_{0},d_{1},...,d_{\sigma_{1}t-1}$}
\Output{monic irreducible polynomial $g \in \mathbb{F}_{q}$ or $\bot$}
$\beta_{j} \gets \sum^{m-1}_{i=0} d_{\sigma_{1}j+i}z^{i}$ for each $j \in {0,1,...,t-1}$\;
$\beta \gets \beta_{0} + \beta_{1}y + ... + \beta_{t-1}y^{t-1} \in \mathbb{F}_{q} [y] / F(y)$\;
Calculate minimal polynomial $g$ of $\beta$ over $\mathbb{F}_{q}$\;
\eIf{degree of $g == t$}{
Return $g$\;
}{
Return $\bot$\;
}
\end{algorithm}
\begin{algorithm}[H]
\caption{FieldOrdering}\label{alg:fieldOrdering}
\SetKwInOut{Input}{input}\SetKwInOut{Output}{output}
\Input{$\sigma_{2}q$ bit string}
\Output{sequence $(\alpha_{1}, \alpha_{2},..., \alpha_{q}) \in \mathbb{F}_{2}$ q unique elements or $\bot$}
First $\sigma_{2}q$ bits as $\sigma_{2}q$-bit int $a_{0} \gets b_{0}+2b_{1}+ ... + 2^{\sigma_{2}-1}b_{\sigma_{2}-1}$, the next $\sigma_{2}q$ bits are $a_{1}$ $\sigma_{2}q$-bit int, up until $a_{q-1}$\;
\If{$a_{0},a_{1},...,a_{q-1}$ != distinct}{
Return $\bot$\;
}
$(a_{\pi(i)}, \pi(i)) \gets (a_{i},i)$ pairs sorted in lexicographic order \Comment*[r]{$\pi$ as permutation of \{0,1,...,q-1\}}
$\alpha_{i+1} \gets \sum_{j=0}^{m-1} \underbrace{\pi(i)_{j}}_{j-th \ least \ significant \ bit \ of \ \pi(i)} \times \ z^{m-1-j}$\;
Return $(\alpha_{0},\alpha_{1},...,\alpha_{q})$
\end{algorithm}
\begin{algorithm}[H]
\caption{Classic McEliece - Key generation}\label{alg:two}
\SetKwInOut{Input}{input}\SetKwInOut{Output}{output}
\Input{none /* corresponds to crypto\_kem\_keypair(pk, sk, rng) in the Rust implementation */}
\Output{public key (pk) and private key (sk)}
$\delta \gets \text{uniform random } \ell \text{-bit string as seed}$\;
$E \gets G(\delta) \text{ which resembles a string containing } n+\sigma_{2}q + \sigma_{1}t + \ell \text{ bits}$\;
$\delta^{\prime} \gets \text{the last } \ell \text{ bits of } E$\;
$s \gets \text{the first $n$ bits of $E$}$\;
$(\alpha_{1},..., \alpha_{q}) \gets$ of next $\sigma_{2}q$ bits from $E$ using FieldOrdering\;
\If{$\bot$ == FieldOrdering($\sigma_{2}q$)}{
$\delta \gets \delta^{\prime}$ and restart the algorithm;
}
$g \gets $ of next $\sigma_{1}t$ bits from $E$ using Irreducible-polynomial generator algorithm\;
\If{$\bot$ == Irreducible($\sigma_{1}t$)}{
$\delta \gets \delta^{\prime}$ and restart the algorithm;
}
$\Upgamma \gets (g, \alpha_{1}, \alpha_{2},..., \alpha_{n})$ \Comment*[r]{$\alpha_{n+1}$ until $\alpha_{q}$ are not applied for $\Upgamma$}\;
$(T, c_{n-k-\mu+1},...,c_{n-k}, \Upgamma^{\prime}) \gets$ MatGen($\Upgamma$)\;
\If{$\bot$ == MatGen( $\Upgamma$)}{
$\delta \gets \delta^{\prime}$ and restart the algorithm;
}
$\Upgamma^{\prime} \gets (g, \alpha^{\prime}_{1}, \alpha^{\prime}_{2},..., \alpha^{\prime}_{n})$\;
Return $T$ the public key \& $(\delta, c, g, \alpha, s)$ the private key, where $c=(c_{n-k-\mu+1},...,c_{n-k}$ and $\alpha=(\alpha^{\prime}_{1},...,\alpha^{\prime}_{n}, \alpha^{\prime}_{n+1},...,\alpha_{q})$\;
\end{algorithm}
\begin{algorithm}[H]
\caption{FixedWeight}\label{alg:fixedweight}
\SetKwInOut{Input}{input}\SetKwInOut{Output}{output}
\Input{none}
\Output{error-vector $e \in \mathbb{F}^{n}_{2}$ of weight t}
$\sigma_{1}\tau \text{ uniform random bits $b_{0},b_{1},...,b_{\sigma_{1}\tau - 1}$ are generated}$\;
\For{$j\leftarrow 0$ \KwTo $\tau-1$}{
$d_{j} \gets \sum_{i=0}^{m-1} b_{\sigma_{1}j+i}2^{i}$\;
}
$(a_{0},a_{1},...,a_{t - 1}) \gets \text{the first $t$ entries in $(d_{0},d_{1},...,d_{\tau - 1})$ in the range ${0,1,...,n-1}$}$\;
\If{$\text{entries} < t$}{
\text{restart algorithm}\;
}
\For{$i \leftarrow 0$ \KwTo $t-1$}{
\If{$a_{i}$ != distinct}{
\text{restart algorithm}\;
}
}
$e \gets (e_{0},e_{1},...,e_{t - 1} \in \mathbb{F}^{n}_{2}) \text{ error-vector of weight $t$ is defined that $e_{a_{i}} = 1$ for each $i$}$\;
Return $e$\;
\end{algorithm}
\begin{algorithm}[H]
\caption{Classic McEliece - Encapsulation}\label{alg:encap}
\SetKwInOut{Input}{input}\SetKwInOut{Output}{output}
\Input{public key $T$ as $(n-k) \times k$ matrix $\in \mathbb{F}_{2}$}
\Output{ciphertext C \& session key K}
$e \gets$ FixedWeight() \Comment*[r]{column vector $e \in \mathbb{F}^{n}_{2}$ of weight t}
$H \gets (I_{n-k} \ | \ T)$\;
$C_{0} \gets He \in \mathbb{F}^{n-k}_{2}$\;
$C_{1} \gets H(2,e)$\;
$C \gets (C_{0},C_{1})$\;
$K \gets H(1,e,C)$\;
Return $C$ and $K$\;
\end{algorithm}
\begin{algorithm}[H]
\caption{Decode}\label{alg:decode}
\SetKwInOut{Input}{input}\SetKwInOut{Output}{output}
\Input{$C_{0} \in \mathbb{F}^{n-k}_{2}$ and $\Upgamma^{\prime}$ - $(g, \alpha^{\prime}_{1}, \alpha^{\prime}_{2},..., \alpha^{\prime}_{i})$}
\Output{error-vector $e$ or failure $\bot$}
Expand $C_{0}$ to $v = (C_{0},0...,0) \in \mathbb{F}^{n}_{2}$ via addition of $k$ zeros\;
Detect unique codeword $c$ with distance $\leq t$ from $v$ in goppa-code $\Upgamma^{\prime}$\;
\If{no codeword c found}{
Return $\bot$\;
}
$e \gets v + c$\;
\eIf{$wt(e) == t$ and $C_{0} = He$}{
Return $e$\;
}{
Return $\bot$\;
}
\end{algorithm}
\begin{algorithm}[H]
\caption{Classic McEliece - Decapsulation}\label{alg:decap}
\SetKwInOut{Input}{input}\SetKwInOut{Output}{output}
\Input{ciphertext C and private Key}
\Output{session key K}
Unravel $C$ to $(C_{0}, C_{1})$ as $C_{0} \in \mathbb{F}^{n-k}_{2} \ \& \ C_{1} \in \mathbb{F}^{\ell}_{2}$\;
$b \gets 1$\;
Retrieve $s \in \mathbb{F}^{n}_{2}$ \& $\Upgamma^{\prime} = (g, \alpha^{\prime}_{1}, \alpha^{\prime}_{2},..., \alpha^{\prime}_{i})$ from private key\;
$e \gets $ Decode($C_{0}, \Upgamma^{\prime}$)\;
\If{$e == \bot$}{
$e \gets s$\;
$b \gets 0$\;
}
$C^{\prime}_{1} = H(2,e)$\;
\If{$C^{\prime}_{1}$ != $C_{1}$}{
$e \gets s$\;
$b \gets 0$\;
}
$K \gets H(b,e,C)$\;
Return $K$\;
\end{algorithm}
\pagebreak
\section{Rust programming language}
\label{sec:rust_programming_lang}
The Rust language emerged primarily focusing on memory safety and safe concurrent programming while trying to be efficient in execution time. In this context, "safety" refers to the absence of undefined program behavior. Undefined behavior can be triggered by fault categories such as use-after-free, out-of-bounds read or write operations, and race conditions. A brief primer on concurrency and memory safety in computer programming follows.
\subsection{Concurrency}
In concurrent program-flow execution, different operations independent of each other can be executed at any given time and out-of-order. This execution scheme can be overlapped by first processing Task\_1 and then performing a task switch to a second Task\_2. Since the work gets processed sequentially, it differs from true parallel program flow as the tasks are processed simultaneously. Concurrency in programs is introduced to achieve more workload simultaneously (increase performance) and as a means of separation of concerns (discriminating different functionality). Unfortunately, the flawless implementation of concurrent programs is impeded due to race conditions. A race condition transpires when two or more execution threads attempt to perform read/write actions on the same data location and at the same time. Another prerequisite for a race condition is that threads attempt to \textbf{modify} data because different ordered read operations do not affect each other's outcomes~\cite{cppConcurrency, DBLP:journals/corr/abs-1904-12210}. Since the Classic McEliece implementation in Rust does not use multithreading, further discussion of the concurrency topic will be omitted, but can be found in the textbook "C++ concurrency in action"~\cite{cppConcurrency} or more Rust specific, in the publication "A Practical Analysis of Rust's Concurrency Story"~\cite{DBLP:journals/corr/abs-1904-12210}.
\subsection{Memory safety}
Runtime errors in software are common and cause numerous problems, ranging from annoyances during user operation to the complete failure of critical digital infrastructure. Typically in low-level programming languages such as C/C++, where the programmer has high control of memory management, such errors occur frequently. More specifically, the errors which are the most critical can be summarized as memory safety bugs~\cite{mitreTop25CVE}. This category is further partitioned into errors that arise from invalid pointer operations or out-of-bounds memory access. In an out-of-bounds memory access operation, an adjacent region of memory outside the allocated space of a program is read or overwritten. Such an anomaly may emerge by invalid checking buffer boundaries or a scope within bounds but with a mismatched unit type size. Further, two types of unauthorized access are documented. A "buffer over-read" where out-of-bounds memory is read and a "buffer overflow" where additional pieces of memory are overwritten~\cite{memsafetyHui}.
Concerning the category of invalid pointer operations, a pointer is a data object that holds the value of (or "points" to) a specific memory address. A dangling pointer refers to a memory region marked deallocated or reused. So-called "use-after-free" errors arise on a write/read attempt of the value of a dangling pointer.
These software errors may be abused for malicious activities by attackers once they have exploited a vulnerability. This is possible because the memory layout is explicitly defined in operating systems and program code. An example of such a procedure could be the following. An attacker finds the possibility to manipulate a program's memory in a way that is not intended by abusing one of the previously described software vulnerabilities. In the attack step, a carefully crafted payload is given to the program as input and subsequently exchanges valid executable code in a memory region with malicious code or reads sensitive data from a memory segment not belonging to the program. As a means to mitigate malicious memory attacks, prevention mechanisms such as stack canaries (before a function returns a clandestine value placed at the stack boundary is checked for modification. If it has been overwritten, the program terminates at once) and ASLR (address space layout randomization) both for user and kernel space (KASLR) have been implemented. However, such remedies only raise the workload for the attacker and cannot wholly prevent memory safety errors from being abused. Stack canary values can be leaked, and ASLR has been broken, e.g., due to side-channel leakage~\cite{KASLRbreak, shadowAndStackCanary, memsafetyHui}. The security vulnerability "heartbleed"~\cite{heartbleed} is an example of a severe memory safety bug in the cryptographic library OpenSSL, which was estimated to affect approximately 24-55\% of hosted HTTPS websites. The root cause of the error was a missing buffer bounds check. The attacker could specify a request message and an associated length. However, the actual length number was not checked to be matching the length of the request message word. For instance, a small payload message such as "test" with an actual length of 4 characters combined with a length of 600 would trigger an out-of-bounds read operation that leaks private information from another memory segment.
\subsection{Memory safety concepts in Rust}
The Rust programming language tries to take an approach to memory safety by enforcing strict semantic checks at compile time to determine possible memory safety violations. The compilation process is aborted with appropriate log messages describing the error if any are found.
\subsubsection{Ownership, Lifetimes, Borrowing}
As one of the main features in the Rust language, the concept of ownership represents a collection of rules of how a compiled Rust program manages memory operations. In contrast to performance impeding runtime checks employed in memory management techniques such as garbage collection or reference counting, the Rust compiler checks the ownership rules during compile time. As a consequence, Rust programs exhibit comparatively fast runtimes. The basic rules of ownership are as follows~\cite{rustBook, rustForRustaceans}:
\begin{itemize}
\item Every value has a single designated "owner" variable which has the permission to perform read/write operations on the memory location
\item The "owner" of a value can be "moved" to a new variable (for instance, by assigning to a new variable) however, there can only be one single owner at any given time.
\item Freeing the resources that values have acquired takes place once the variable that owns the value leaves the current scope.
\end{itemize}
Moreover, it is possible to borrow the ownership of a value by using (shared) references among variables. A \textit{reference} can be thought of as a "label" that represents a value stored at a particular address, of which a designated variable holds the ownership. Moreover, a reference ensures to point to a valid value of a specific type. The variable becomes immutable in a shared reference scenario, restricting access via read-only operations. Therefore, Rust provides the choice of either having multiple aliases to a value but then writing modifications of it are not allowed or having exclusive ownership of a value by one single variable with the privilege of write and read operations.
Another concept that is closely related to borrowing is the \textit{lifetime} of variables or references. Generally, a lifetime specifies a code section where a variable or reference is guaranteed to be valid. Lifetimes start upon the declaration of a reference and cease to exist once the reference goes out of scope, the reference gets modified, or the variable gets \textit{moved} to another variable. A move corresponds to the operations of first copying a pointer to some memory address and its associated metadata (length, capacity), followed by invalidating the previous pointer variable.
A so-called borrow checker algorithm is implemented to ensure the lifetime of a reference in Rust. When the code execution reaches a point where a reference is used,
the associated lifetime is examined for validity. This check works by carefully
tracing back from the region of use to the start of the lifetime and verifying that
there are no conflicting usages among other variables. As a consequence, after the check, the reference should be guaranteed to point to a valid, accessible value~\cite{rustBook, rustForRustaceans, memsafetyHui}.
\begin{listing}[H]
\begin{minted}{rust}
let mut x = Box::new(30);
let mut z = &x; // (1) lifetime of reference to x starts
if 10 > 2 {
*x = 15; // (2) modification of x
} else {
println!("{}", z); // (3) reading reference to x
}
// (4) lifetime ends
\end{minted}
\caption{Example code to illustrate the range of validity of lifetimes in Rust~\cite{rustForRustaceans}.}
\label{fig:RustLifetime}
\end{listing}
In code listing 1, first a new value is allocated on the heap with \inlinecode{Box::new(30)} and gets assigned to the variable \inlinecode{x}. Upon taking a reference to \inlinecode{x} via \inlinecode{(1)}, the lifetime begins and does only reach into the else execution branch. Following the program flow, we have two branches. If the specified condition is true, the variable \inlinecode{x} gets dereferenced and modified, which requires a mutable reference. This modification is allowed since the borrow checker recognizes that afterward \inlinecode{z} is never being used, and therefore no conflict arises. When examining the else branch of the execution flow, the borrow checker realizes a usage of \inlinecode{z} at \inlinecode{(3)} and draws a flow to the last use at \inlinecode{(1)}. Since operation \inlinecode{(2)} is not executed in this branch, there are no flow conflicts, and the code will compile. However, once \inlinecode{z} will get used before \inlinecode{4}, the rule that there can only be multiple immutable references to a variable gets violated, and the borrow checker denies compilation~\cite{rustForRustaceans}.
By adhering to this design, concurrency errors such as race conditions and illegal pointer operations, e.g., use-after-free or double-free errors, can be successfully prevented.
One of the main pitfalls in the C or respectively C++ language is the problem of shared mutual references to dynamically allocated resources. A typical error scenario is freeing or dereferencing pointer variables that another reference has already freed. Rust can prevent this with the ownership design by guaranteeing that once a (pointer) variable is defined, it is correctly initialized with a value. Furthermore, by tracking the variable's lifetime or borrowing operations, the Rust compiler checks that there are no shared mutual references to the variable at any point in time. However, these guarantees are only valid for "safe" Rust code. Within "unsafe" code blocks, memory safety mechanisms of Rust do not hold. In a Rust "unsafe" code blocks, extra features such as dereferencing raw pointers or read/write access on a mutable variable are allowed.
Another essential memory security issue is the out-of-bounds access of buffers
is remedied by Rusts ownership system as it verifies that memory is aligned correctly and that (pointer) variables have to refer to a value of a particular type. Moreover, concerning data storage structures such as \inlinecode{Vec<T>}, Rust keeps track of its length, and automatic range checks are employed~\cite{memsafetyHui}.
However, there are currently still limitations to the Rust ownership design.
When it comes to the implementation of low-level concurrency primitives such
as mutexes (mutual exclusion) or inter-thread communication, which are heavily dependent on mutable shared references, this would not be allowed by the ownership rules. As a circumvention, Rust allows the "unsafe" keyword for a scope in the program code for which shared mutual references are allowed. However, the programmer is then responsible for the safe allocation and deallocation of shared resources~\cite{rustForRustaceans, RustBelt, rustBook}.
From another perspective, the usage of "unsafe" code blocks limits the scope of code where the programmer needs to take extra care of memory safety issues. In the Classic McEliece Rust implementation, no explicit "unsafe" code blocks have been used.
\chapter{Implementation \& Evaluation}
\label{chap:impl_eval}
\section{Implementation}
In the course of the thesis, a port of the Classic McEliece KEM reference implementation in the C programming language to the Rust programming language has been carried out. In order to obtain an overview of the Classic McEliece implementation, there are ten different parameter sets with deviating sizes of $n$ code length, $t$ error-correction capability, and $m$-log of the field size. For every key size variation, the label string is composed of the prefix "mceliece" followed by a number consisting of $n$ appended with $t$. A specific implementation for the semi-systematic form $(\mu, \nu) = (32,64)$ is provided for every implementation variant, which is denoted by a "\textit{f}" postfix. In the Rust port, the individual variants are included in the single implementation and can be enabled via feature flags in contrast to the NIST C submission, where the variants are implemented separately.
\inlinecode{mceliece348864, mceliece348864f, mceliece460896, mceliece460896f, mceliece6688128}
\inlinecode{mceliece6688128f, mceliece6960119, mceliece6960119f, mceliece8192128, mceliece8192128f}
The ported implementation is dependent on the Rust crates \inlinecode{sha3} providing the SHA-3 hash implementation and the \inlinecode{aes} crate representing the AES block cipher implementation used for the random state.
The Classic McEliece KEM variant \inlinecode{mceliece8192128f} was chosen as the first variant to implement in Rust. After \inlinecode{mceliece8192128f} successfully passed the desired key-negotiation unit tests, Lukas Prokop expanded the implementation with the recursive control bits subroutine, random bytes generation, the remaining key sizes, which can be activated with feature flags as well as further code optimizations.
\subsection{High-level interface}
\label{sec:key_gen}
In this section a quick overview of the implemented API is given. An example usage of the KEM implementation is depicted in the code listing below:
\begin{listing}[H]
\begin{minted}{rust}
let mut rng = AesState::new();
let mut pk = [0u8; CRYPTO_PUBLICKEYBYTES];
let mut sk = [0u8; CRYPTO_SECRETKEYBYTES];
let mut ct = [0u8; CRYPTO_CIPHERTEXTBYTES];
let mut ss_alice = [0u8; CRYPTO_BYTES];
let mut ss_bob = [0u8; CRYPTO_BYTES];
crypto_kem_keypair(&mut pk, &mut sk, &mut rng)?;
crypto_kem_enc(&mut ct, &mut ss_bob, &pk, &mut rng)?;
crypto_kem_dec(&mut ss_alice, &ct, &sk)?;
assert_eq!(ss_bob, ss_alice);
\end{minted}
\caption{Example usage of the Rust Classic McEliece API}
\label{list:ApiOverview}
\end{listing}
Initially, the \inlinecode{crypto\_kem\_keypair} function takes the pre-allocated public and secret key arrays, and the random state variable calculates and writes the generated public and secret key bytes in the respective arrays. In the encapsulation step, the public key serves as the input, and a shared key (in this example between the two parties, Alice and Bob) together with a ciphertext is the output. Ultimately for the decapsulation, the secret key and ciphertext are taken as input, and a shared key \inlinecode{ss\_alice} is produced, which should be identical to the one generated by the encapsulation step.
\section{Evaluation}
\subsection{Differences between the C reference implementation and the Rust implementation}
One of the significant obstacles to port classic McEliece from C to Rust is the pointer arithmetic operations. On multiple occasions in C functions, simply a pointer of a particular data type was delivered as a function argument, and it was necessary to first find out if the passed pointer to the function was meant to be operated as an array and, if so, the size of the actual array had to be determined. Since Rust does not support raw pointer arithmetic, either a mutual reference of fixed size or a slice reference was passed to the ported Rust function. The following code example from the \inlinecode{benes.c / benes.rs} subroutine implementation shall illustrate this thought process:
\begin{listing}[H]
\begin{minted}{c}
static void layer_ex(uint64_t* data, uint64_t* bits, int lgs)
{
int i, j, s;
uint64_t d;
s = 1 << lgs;
for (i = 0; i < 128; i += s*2)
for (j = i; j < i+s; j++)
{
d = (data[j+0] ^ data[j+s]);
d &= (*bits++);
data[j+0] ^= d;
data[j+s] ^= d;
}
}
uint64_t r_int_h[2][64];
uint64_t b_int_h[64];
// iter is a integer from 0 to 6
layer_ex(r_int_h[0], b_int_h, iter);
\end{minted}
\caption{Pointer arithmetic example in the C reference implementation}
\label{list:Layer_ex_C}
\end{listing}
In this example in listing 3, the unsigned 64-bit integer pointer variable \inlinecode{data} is a 2D array where the pointer of the start is passed to the function \inlinecode{layer\_ex}. In the C implementation, once the offset variable $s$ is initialized with the value $64$ a wrap-around occurs where implicitly, the start of the next 64 values of the \inlinecode{data} variable are accessed. In Rust, two subarrays (each of size 64) are passed to the \inlinecode{layer\_ex} function, and such a wrap-around operation would result in an out-of-bounds array access. As a consequence, case differentiation has to be carried out when the loop counter modification variable \inlinecode{s} is initialized with $64$. Alternatively, the type of the \inlinecode{data} function parameter variable could be cast into a 1D array with 128 values which is operated on as a reference to the original 2D array. A casting possibility would be to use the bytemuck crate function \inlinecode{cast\_mut} however, this would result in an implicit "unsafe" function call.
\begin{listing}[H]
\begin{minted}{rust}
fn layer_ex(data: &mut [[u64; 64]; 2], bits: &[u64], lgs: usize) {
let data: &mut [u64; 128] = bytemuck::cast_mut(data);
// ...
}
\end{minted}
\caption{Suggestion to cast data parameter of layer\_ex function}
\label{list:Layer_ex_Rust_opt}
\end{listing}
\begin{listing}[H]
\begin{minted}{rust}
fn layer_ex(data: &mut [[u64; 64]; 2], bits: &[u64], lgs: usize) {
let mut data0_idx = 0;
let mut data1_idx = 32;
let s = 1 << lgs;
if s == 64 {
// if counter modification varible 's' exhibits the largest
// possible value both subarrays can be accessed at once
for j in 0..64 {
let mut d = data[0][j + 0] ^ data[1][j];
d &= bits[data0_idx];
data0_idx += 1;
data[0][j + 0] ^= d;
data[1][j] ^= d;
}
} else {
// in a single iteration of loop over 'j' the individual
// 64 size arrays need to be accessed one after the other
let mut i: usize = 0;
while i < 64 {
for j in i..(i + s) {
// access to data[0]
let mut d = data[0][j + 0] ^ data[0][j + s];
d &= bits[data0_idx];
data0_idx += 1;
data[0][j + 0] ^= d;
data[0][j + s] ^= d;
// access to data[1]
d = data[1][j + 0] ^ data[1][j + s];
d &= bits[data1_idx];
data1_idx += 1;
data[1][j + 0] ^= d;
data[1][j + s] ^= d;
}
i += s * 2;
}
}
}
let mut r_int_h = [[0u64; 64]; 2];
let mut b_int_h = [0u64; 64];
layer_ex(&mut r_int_h, &mut b_int_h, iter);
\end{minted}
\caption{Rust implementation of layer\_ex function}
\label{list:Layer_ex_Rust}
\end{listing}
An alternative refactoring simplification compared to listing 5, would be to keep two for loops, depicted listing 3 and replace the initialization of variable \inlinecode{d} with \inlinecode{d = data[s / 64][j + 0] \textasciicircum \ data[s / 64][(j + s) \% 64]}. With this change the division of 64 toggles between the indexation of the first or second array and modulo 64 maintains the bounds in order to index the 64 individual values.
Furthermore, while crafting the implementation, care was taken not to introduce unnecessary code flow branches dependent on secret information, such as access to private key bits. In side-channel attacks, the attacker utilizes indirect information leaked by the software implementation in order to obtain secret information used in the implementation, such as the private key. In order to successfully exploit such an attack, the pre-condition of a correlation between application runtime and secret information needs to be fulfilled. An example can be a secret-dependent control flow, where code path execution depends on changes in the secret~\cite{sideChannelsInCrypto}. Many side-channel attack preventions built into the C reference implementation are dependent on low-level bit manipulations, which are also supported by the Rust language. Especially on value subtraction/addition usages in Rust the functions \inlinecode{overflowing\_shr} or \inlinecode{wrapping\_shr}, \inlinecode{wrapping\_add}, \inlinecode{wrapping\_sub} were applied in order to avoid overflow panics~\cite{rust_methods}.
Another interesting observation of the Classic McEliece implementation is the \inlinecode{uint64\_minmax} function which is used in the constant-time \textit{djbsort} algorithm during the public-key generation \inlinecode{pk\_gen.rs} subroutine~\cite{djb-sort, classicMcElieceSubmission3}. The C implementation of \inlinecode{uint64\_minmax} exhibits only correct behavior for unsigned integers up to 63-bit. The function should be renamed to \inlinecode{int64\_MINMAX} since the desired behavior can only be observed for input values up to and including $2^{63}$. Only the most significant bit is kept, representing the occurrence of an overflow. The rest is discarded, as illustrated in Listing 5.
\begin{listing}[H]
\centering
\begin{minted}[linenos]{c}
#define uint64_MINMAX(a,b) \
do { \
uint64_t c = b - a; \
c >>= 63; \
c = -c; \
c &= a ^ b; \
a ^= c; \
b ^= c; \
} while(0)
\end{minted}
\caption{uint64\_MINMAX C implementation}
\label{list:uint64_minmax_C}
\end{listing}
In Rust, the function was modified in order to compute the overflow detection in a branch avoiding fashion, placing the result indicating if an overflow occurred, in the highest bit position of variable \inlinecode{d}. Using the two's complement the subtraction operation $a - b$ can be represented as $a + !b + 1$. The result of $(!b \ \& \ a)$ is exactly one, if the addition $(a + !b + 1)$ produced a carry. The term $(!b \ | \ a)$ represents all positions where in the addition operation a carry has been forwarded. Details are described in the book \textit{Hacker's Delight} on page 40~\cite{hackersDelight}. Calculating the overflow detection via \inlinecode{let (\_, c) = b.overflowing\_sub(a);} would create a \textit{bltu} branch instruction on the RISC-V 64-bit architecture. This behaviour was observed using the rust compiler version \inlinecode{rustc 1.60.0 (7737e0b5c 2022-04-04)} and compiler flags \inlinecode{--target=riscv64gc-unknown-linux-gnu -O}.
\begin{listing}[H]
\begin{minted}{rust}
const fn uint64_minmax(mut a: u64, mut b: u64) -> (u64, u64) {
let d: u64 = (!b & a) | ((!b | a) & (b.wrapping_sub(a)));
let mut c: u64 = d >> 63;
c = 0u64.wrapping_sub(c);
c &= a ^ b;
a ^= c;