Security 0000 *Implementation results* 0000

Conclusion

# CAESAR candidate SCREAM Side-Channel Resistant Authenticated Encryption with Masking

## Vincent Grosso<sup>1</sup> <u>Gaëtan Leurent</u><sup>1,2</sup> François-Xavier Standert<sup>1</sup> Kerem Varıcı<sup>1</sup> François Durvaux<sup>1</sup> Lubos Gaspar<sup>1</sup> Stéphanie Kerckhof<sup>1</sup>

<sup>1</sup>UCL, Belgium & <sup>2</sup>Inria, France scream@uclouvain.be

#### **DIAC 2014**

G. Leurent (UCL, Inria)

CAESAR candidate SCREAM

DIAC 2014 1 / 21

## Authenticated Encryption

#### Many different ways to build authenticated encryption

- Block cipher based
  - 2-pass: GCM, CCM, ...
  - 1-pass: OCB, ...
  - Nonce-misuse resistant: SIV, COPA, POET, ...
- Permutation based
  - SpongeWrap, DuplexWrap, MonkeyWrap, APE, ...
- Stream cipher + MAC
  - Encrypt-then-MAC, MAC-then-Encrypt, Encrypt-and-MAC
- Dedicated
  - Helix/Phelix, ALE, ...


#### *Birthday bound security*

Most block cipher-based and permutation-based modes only have birthday bound security.

They need a 2*n*-bit primitive to resist attacks with 2<sup>*n*</sup> data and 2<sup>*n*</sup> time.

Side question: is this *n*-bit security or 2*n*-bit security?

- Use a 128-bit primitive: low security
- Design a larger primitive: larger hardware


#### *Beyond birthday security*

Tweakable Block Ciphers provide security beyond the birthday bound. Modes with an *n*-bit TBC resist attacks with  $2^n$  data and  $2^n$  time.


Definition (Tweakable block cipher – Liskov, Rivest, Wagner)

Family of permutations indexed by a key K (secret) and a tweak T (public)

For each tweak $T$, $x \mapsto E_K(T, x)$ is an independent PRF

- TAE: Tweakable Authenticated Encryption (Liskov, Rivest, Wagner)
  - Nonce-based AEAD, inspired by OCB
  - Tweak is Nonce+Counter
  - Full *n*-bit security
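
The TAE idea above can be sketched in a few lines of Python. This is a simplified illustration, not the SCREAM specification: `toy_tbc_encrypt` is a hypothetical stand-in for a real tweakable block cipher (a 4-round Feistel network built from SHA-256, with no security claim), the tweak layout in `make_tweak` is an assumption, and only messages made of full 128-bit blocks without associated data are handled.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes (128 bits)


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round_fn(key: bytes, tweak: bytes, rnd: int, half: bytes) -> bytes:
    # Toy round function; a stand-in only, NOT the SCREAM cipher.
    return hashlib.sha256(key + tweak + bytes([rnd]) + half).digest()[:8]


def toy_tbc_encrypt(key: bytes, tweak: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, _xor(left, _round_fn(key, tweak, rnd, right))
    return left + right


def toy_tbc_decrypt(key: bytes, tweak: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _round_fn(key, tweak, rnd, left)), left
    return left + right


def make_tweak(nonce: bytes, counter: int, final: bool) -> bytes:
    # Assumed layout: 8-byte nonce | 7-byte counter | 1-byte domain flag.
    return nonce + counter.to_bytes(7, "big") + bytes([1 if final else 0])


def _blocks(data: bytes) -> list:
    assert len(data) % BLOCK == 0, "sketch handles full blocks only"
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def tae_encrypt(key: bytes, nonce: bytes, message: bytes):
    checksum, ciphertext = bytes(BLOCK), b""
    blocks = _blocks(message)
    for i, m in enumerate(blocks):
        # Each block is encrypted under its own tweak (nonce + counter).
        ciphertext += toy_tbc_encrypt(key, make_tweak(nonce, i, False), m)
        checksum = _xor(checksum, m)
    # One extra TBC call on the plaintext checksum produces the tag.
    tag = toy_tbc_encrypt(key, make_tweak(nonce, len(blocks), True), checksum)
    return ciphertext, tag


def tae_decrypt(key: bytes, nonce: bytes, ciphertext: bytes, tag: bytes):
    checksum, message = bytes(BLOCK), b""
    blocks = _blocks(ciphertext)
    for i, c in enumerate(blocks):
        m = toy_tbc_decrypt(key, make_tweak(nonce, i, False), c)
        message += m
        checksum = _xor(checksum, m)
    expected = toy_tbc_encrypt(key, make_tweak(nonce, len(blocks), True), checksum)
    return message if hmac.compare_digest(expected, tag) else None
```

Every block call uses a distinct tweak and is independent of the others, which is why the mode parallelizes, and the tag costs a single extra TBC call.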


### Tweakable block cipher based AE modes



#### TAE Features

- Fully parallelizable
- 128-bit security with 128-bit state
  - + key, nonce, checksum
- Low overhead (1TBC); good for small messages
- Minimal extension
- Patent-free?


### TBC design

We want to design a tweakable block cipher that is secure and efficient on a wide range of platforms.

- Side-channel resistance is necessary in many lightweight settings
  - Avoid your car keys / credit card being cloned
- Usual approach:
  1. Design a secure cipher (AES, PRESENT, Noekeon, ...)
  2. Implement with side-channel countermeasures
- We use LS-Designs, with a reverse approach:
  1. Use operations that are easy to mask
  2. In order to design a secure cipher
- Previous work: Zorro, PICARO


## *Choice of operations*

#### Important remark

Logic gates are easier to mask than table-based S-boxes (if we target Boolean masking).

- Use bitsliced S-boxes (SERPENT, Noekeon, ...)
  - One word contains the msb (resp. 2<sup>nd</sup> bit, ...) of every S-box
  - Bitwise operations: 8 S-boxes in parallel using 8-bit words
  - Use a small number of non-linear gates
- We can use tables for the diffusion layer!
  - Efficient, good diffusion
  - Easy to mask (linear)
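
To illustrate why gates are easy to mask, here is a minimal first-order Boolean masking sketch on 8-bit bitsliced words. The slides do not say which masking scheme is used; the AND gadget below follows the classic Ishai-Sahai-Wagner construction with two shares, and the function names are hypothetical.

```python
import secrets


def share(x: int) -> tuple:
    """Split an 8-bit word x into two Boolean shares with x == s0 ^ s1."""
    s0 = secrets.randbits(8)
    return s0, x ^ s0


def unshare(s: tuple) -> int:
    return s[0] ^ s[1]


def masked_xor(a: tuple, b: tuple) -> tuple:
    # Linear gate: applied share by share, no fresh randomness needed.
    return a[0] ^ b[0], a[1] ^ b[1]


def masked_and(a: tuple, b: tuple) -> tuple:
    # Non-linear gate: needs the cross terms and one fresh random word.
    r = secrets.randbits(8)
    c0 = (a[0] & b[0]) ^ r
    c1 = (a[1] & b[1]) ^ ((r ^ (a[0] & b[1])) ^ (a[1] & b[0]))
    return c0, c1


# Each bit position of the 8-bit word belongs to a different S-box,
# so one masked gate processes 8 S-boxes in parallel.
sa, sb = share(0x5A), share(0x3C)
assert unshare(masked_and(sa, sb)) == 0x5A & 0x3C
```

Linear operations cost one operation per share, while each AND needs cross terms and randomness. This is the reason for keeping the number of non-linear gates small and putting the complexity into the linear layer.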


## LS-designs

#### Mathematical description: SPN network

- S-boxes
  - With simple gate representation
- Linear diffusion layer
  - Mixes the full state
  - Binary coefficients
- Good design criterion: wide-trail



#### Bitslice implementation:

- S-box as a series of bitwise operations with CPU words
- L-box tables for diffusion layer
- Easy to mask (simple non-linear ops., complex linear ops.)


### LS-designs

$x \leftarrow P \oplus K$

**for** $0 \le r < N_r$ **do**

- ▷ S-box layer: **for** $0 \le i < l$ **do** $x[i, \star] = S[x[i, \star]]$
- ▷ L-box layer: **for** $0 \le j < s$ **do** $x[\star, j] = L[x[\star, j]]$
- ▷ Key addition: $x \leftarrow x \oplus k_r$

**return** $x$

(Figure: state as a bit-matrix, with the S-box layer and the L-box layer.)
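
The LS-design round structure above can be written as a short Python sketch. Everything concrete here is an assumption made for illustration: the toy sizes (4-bit S-box, 4-bit L-box, 16-bit state), the S-box (borrowed from PRESENT as a placeholder) and the rotation-based L-box are not the SCREAM components.

```python
S_BITS, L_BITS = 4, 4  # toy sizes: s-bit S-box, l-bit L-box

# 4-bit S-box of PRESENT, used only as a placeholder.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def _rotl(v: int, r: int, bits: int) -> int:
    return ((v << r) | (v >> (bits - r))) & ((1 << bits) - 1)


# Placeholder linear L-box: v ^ (v <<< 1) ^ (v <<< 2) on L_BITS bits.
LBOX = [v ^ _rotl(v, 1, L_BITS) ^ _rotl(v, 2, L_BITS)
        for v in range(1 << L_BITS)]


def ls_encrypt(plaintext_rows, key_rows, round_keys):
    """The state x is a list of l rows of s bits; x[i] plays the role of x[i, *]."""
    x = [p ^ k for p, k in zip(plaintext_rows, key_rows)]
    for k_r in round_keys:
        # S-box layer: one table lookup per row x[i, *].
        x = [SBOX[row] for row in x]
        # L-box layer: one table lookup per column x[*, j].
        for j in range(S_BITS):
            col = sum(((x[i] >> j) & 1) << i for i in range(L_BITS))
            col = LBOX[col]
            for i in range(L_BITS):
                x[i] = (x[i] & ~(1 << j)) | (((col >> i) & 1) << j)
        # Key addition.
        x = [row ^ k for row, k in zip(x, k_r)]
    return x
```

This follows the mathematical description with plain table lookups. A bitsliced implementation would instead evaluate the S-box as gates on whole words and keep tables only for the L-box.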


### SCREAM S-box and L-box

For SCREAM, we reuse the components of Robin/Fantomas:

- 8-bit S-box
  - Built from 3 smaller S-boxes (Feistel-like structure)
  - $Pr_{lin} = 2^{-2}$, $Pr_{diff} = 2^{-4}$, 11/12 non-linear gates
- 16-bit L-box
  - Branch number 8 (optimal for a binary matrix)
  - Orthogonal matrix: differential and linear properties equivalent
  - Built from RM(2, 5) or QR[32, 16, 8]
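
As a reminder of what the branch number measures, the sketch below computes it by exhaustive search as the minimum of wt(x) + wt(L(x)) over non-zero inputs. The map `toy_l` is a hypothetical 4-bit example, not the SCREAM L-box, whose matrix is not given in these slides.

```python
def hamming_weight(v: int) -> int:
    return bin(v).count("1")


def branch_number(linear_map, bits: int) -> int:
    """Minimum of wt(x) + wt(L(x)) over all non-zero x of the given width."""
    return min(hamming_weight(x) + hamming_weight(linear_map(x))
               for x in range(1, 1 << bits))


def rotl(v: int, r: int, bits: int) -> int:
    return ((v << r) | (v >> (bits - r))) & ((1 << bits) - 1)


def toy_l(v: int) -> int:
    # Hypothetical 4-bit linear map, NOT the SCREAM L-box.
    return v ^ rotl(v, 1, 4) ^ rotl(v, 2, 4)


print(branch_number(toy_l, 4))
print(branch_number(lambda v: v, 4))  # identity map: no diffusion
```

For a 16-bit L-box the same search covers 65535 inputs, so checking a candidate matrix this way is cheap.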


## Tweak/Key schedule

- Robin/Fantomas with a tweak/key schedule
  - 128-bit block
  - 128-bit key
  - 128-bit tweak
- Tweak and key have a similar role (cf. TWEAKEY framework)
- ▶ Must be secure against chosen-tweak attacks (≈ related-key)
- Use ideas from LED:
  - One step is two rounds: 8 active S-boxes
  - At least half the steps are active with related-key


### iSCREAM: involutive components

Tweak every step; key every second step



Rotation avoids optimal trails with tweak difference

- $\Delta \rightarrow \Delta$ : 8 active S-Boxes (involution)
- $\Delta \rightarrow \Delta \stackrel{16}{\lll} 1$ : 12 active S-Boxes

## SCREAM: non-involutive components

- ▶ Key-schedule based on a [3, 2, 2]<sub>4</sub> code.
  - Two consecutive subkeys cannot be inactive (with related key).
  - Tweak difference gives the same *truncated* difference in all subkeys.



- Optimize L-box to avoid specific trails
  - 1-R trails  $\Delta \rightarrow \Delta$  have at least 14 active S-boxes
  - RK trails with consecutive active steps are equivalent to SK trails
    - 4-R trail -xx- with tweak difference  $\delta$
    - $\delta \rightsquigarrow a, b \rightsquigarrow \delta$  gives  $b \rightsquigarrow \delta \rightsquigarrow a$ ; at least 20 active S-boxes


#### Outline

- SCREAM design
  - TAE Mode
  - LS-Design TBC
- Security
  - Security Analysis
  - Initial Mistakes
- Implementation results
  - Software
  - Hardware
- Conclusion

## Security Analysis

- ► Fixed key ⊕ Chosen tweak ≈ Related key
  - At least one half of the steps active
- ► Related key ⊕ Chosen tweak ≈ Related key with more freedom
  - At least one half/one third of the steps active (iScream/Scream)
- Wide-trail strategy: each active 2-round step has at least 8 active S-boxes.




Minimum number of active S-boxes

| Setting     | Steps:     | 1 | 2 | 3 | 4  | 5  | 6  | 7  | 8  | 9  | 10 | 11 | 12 |
|-------------|------------|---|---|---|----|----|----|----|----|----|----|----|----|
| Single Key  | Scream-10  | 0 | 0 | 8 | 8  | 16 | 16 | 24 | 24 | 32 |    |    |    |
| Single Key  | iScream-12 | 0 | 0 | 8 | 8  | 16 | 16 | 24 | 24 | 32 |    |    |    |
| Related Key | Scream-12  | 0 | 0 | 8 | 8  | 8  | 16 | 16 | 16 | 24 | 24 | 24 | 32 |
| Related Key | iScream-14 | 0 | 0 | 8 | 16 | 16 | 16 | 24 | 32 | 32 | 32 | 40 | 40 |


## Improved Security Analysis

- Components designed to make those simple trails expensive.
  - Combine analysis at step level, and analysis at S-box level
- Optimal trails have two thirds of the steps active (fixed key).
  - See submission for more details

Minimum number of active S-boxes

| Setting     | Steps:     | 1 | 2 | 3  | 4  | 5  | 6  | 7  | 8  | 9  | 10 | 11 | 12 |
|-------------|------------|---|---|----|----|----|----|----|----|----|----|----|----|
| Single Key  | Scream-10  | 0 | 8 | 14 | 20 | 28 | 35 |    |    |    |    |    |    |
| Single Key  | iScream-12 | 0 | 8 | 12 | 16 | 24 | 28 | 32 | 40 |    |    |    |    |
| Related Key | Scream-12  | 0 | 0 | 8  | 14 | 14 | 22 | 28 | 28 | 36 |    |    |    |
| Related Key | iScream-14 | 0 | 0 | 8  | 16 | 16 | 16 | 24 | 32 | 32 | 32 | 40 | 48 |


#### SCREAM v1 problem

In SCREAM v1, we tried to optimize the use of counters in TAE...

...and failed :-(

In SCREAM v2 we stick to the original TAE.

#### Thanks

Thanks to Wang Lei and Sim Siang for finding out the mistake!


### iSCREAM problem

iSCREAM uses an involutive S-Box and L-Box...

...with some unexpected properties :-(

The strong structure of the involutive L-Box, combined with low-weight round constants, allows a self-similarity attack with weak keys or related keys.

- We focus on SCREAM at the moment
- We plan to redesign iSCREAM in the future
  - Simple tweak: add full constants

#### Thanks

Thanks to Henri Gilbert, Gregor Leander, Brice Minaud, Sondre Rønjom for finding out!


## Implementation: High-end CPUs

- Use large registers (128-bit) for bitsliced S-box
- Use vector permute instructions for L-box
  - 4-bit to 8-bit table with pshufb in SSSE3, vtbl in NEON
  - 16-bit to 16-bit table as 8 small tables
  - Constant time (no cache timing side-channel)
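
One plausible reading of "16-bit to 16-bit table as 8 small tables" is sketched below: because the L-box is linear, its output is the XOR of the contributions of the four input nibbles, and each contribution splits into a low and a high output byte. That gives 4 × 2 = 8 tables with 16 one-byte entries, which is the shape a byte-shuffle instruction such as `pshufb` can look up. The map `toy_lbox` is a hypothetical linear function, not the SCREAM L-box, and the code models the decomposition only, not the SIMD instructions.

```python
def rotl16(v: int, r: int) -> int:
    return ((v << r) | (v >> (16 - r))) & 0xFFFF


def toy_lbox(x: int) -> int:
    # Hypothetical linear map on 16 bits, NOT the SCREAM L-box.
    return x ^ rotl16(x, 3) ^ rotl16(x, 7)


# TABLES[n][b][v]: output byte b (0 = low, 1 = high) contributed by
# value v of input nibble n. 4 nibbles x 2 bytes = 8 tables of 16 bytes.
TABLES = [[[(toy_lbox(v << (4 * n)) >> (8 * b)) & 0xFF for v in range(16)]
           for b in range(2)]
          for n in range(4)]


def lbox_from_tables(x: int) -> int:
    lo = hi = 0
    for n in range(4):
        nib = (x >> (4 * n)) & 0xF
        lo ^= TABLES[n][0][nib]  # each lookup maps 4 bits to 8 bits
        hi ^= TABLES[n][1][nib]
    return lo | (hi << 8)


assert lbox_from_tables(0x1234) == toy_lbox(0x1234)
```

The lookups use no secret-dependent memory addresses when done in registers, which is where the constant-time property comes from.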

#### Results

- Fantomas has performance close to AES (excluding hardware AES)
- Tweak gives more security, requires more rounds (20 vs. 12)
- The TAE mode has a very small overhead
- Performance similar to AES-GCM (excluding hardware AES)


### Implementation: High-end CPUs

#### Software performance for long messages (cycles/byte)

|                     | SCREAM | Scream | Fantomas | AES-GCM | AES  |
|---------------------|--------|--------|----------|---------|------|
| ARM Cortex A15      | 23.5   | 21.8   | 14.2     | 31.1    | 17.8 |
| Atom                | 56     | 55     | 33.3     | 28.8    | 17   |
| Nehalem             | 10.8   | 9.4    | 6.3      | 9.9     | 6.9  |
| Ivy Bridge          | 8.0    | 7.1    | 4.2      | 8.3     | 5.4  |
| Ivy Bridge (AES-NI) |        |        |          | 2.5     | 1.3  |
| Haswell             | 5.7?   | 4.7?   |          | ??      | ??   |
| Haswell (AES-NI)    |        |        |          |         | 0.75 |

Haswell figures are work in progress.

Future Intel CPUs: AVX512, VPTERNLOG, ...

More detailed benchmarks soon in eBASH...


### Implementation: AVR micro-controller

- TBC performance: 7650 cycles
  - Using 1kB table
  - Smaller tables if needed
- For many embedded devices, side-channel attacks are a real threat
- SCREAM has very good performance for masked implementations
  - Noekeon also very good (similar components)


### Implementation: Hardware

- We study implementations with a 128-bit datapath
  - Reasonable price/performance ratio
- Low amount of logic in one round
  - We can unroll one full step per clock cycle
  - One step ≈ one AES round
  - ► SCREAM TBC ≈ AES
- Low overhead for TAE mode
  - Limited extra memory: small total state


#### Hardware performance of the TBC: ASIC

|           | Cycle | Mode (E, D, ED) | Area [µm²] | f<sub>max</sub> [MHz] | Latency [cycles] | Throughput [Mbps] |
|-----------|-------|-----------------|------------|-----------------------|------------------|-------------------|
| AES       | 1R    | E               | 17921      | 444                   | 12               | 4740              |
| AES       | 1R    | D               | 20292      | 377                   | 22               | 2195              |
| AES       | 1R    | ED              | 24272      | 363                   | ≈17              | ≈2997             |
| Scream-10 | 1R    | E               | 12951      | 751                   | 21               | 4577              |
| Scream-10 | 1R    | D               | 12951      | 751                   | 21               | 4577              |
| Scream-10 | 1R    | ED              | 17292      | 751                   | 21               | 4577              |
| Scream-10 | 2R    | E               | 17292      | 446                   | 11               | 5190              |
| Scream-10 | 2R    | D               | 17292      | 446                   | 11               | 5190              |
| Scream-10 | 2R    | ED              | 25974      | 446                   | 11               | 5190              |
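
The throughput column of the ASIC table appears to follow from the other columns. The sketch below assumes throughput = block size × f<sub>max</sub> / latency with a 128-bit block; the formula is not stated on the slides, but it reproduces the reported figures to within rounding.

```python
BLOCK_BITS = 128  # block size of AES and of the Scream TBC


def throughput_mbps(f_max_mhz: float, latency_cycles: float) -> float:
    """One 128-bit block every `latency_cycles` cycles at `f_max_mhz` MHz."""
    return BLOCK_BITS * f_max_mhz / latency_cycles


ROWS = [  # (design, f_max [MHz], latency [cycles], reported [Mbps])
    ("AES 1R, E", 444, 12, 4740),
    ("Scream-10 1R, E", 751, 21, 4577),
    ("Scream-10 2R, E", 446, 11, 5190),
]

for name, f_max, latency, reported in ROWS:
    computed = throughput_mbps(f_max, latency)
    print(f"{name}: {computed:.0f} Mbps computed, {reported} Mbps in the table")
```

Unrolling two rounds per cycle roughly halves both the latency and the clock frequency, so the throughput gain of the 2R variant is modest and is paid for in area.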


#### Hardware performance of the TBC / full mode: Virtex 6 FPGA

|           | Cycle | Slices [slices] | BRAM [×18k] | f<sub>max</sub> [MHz] | Latency [cycles]      | Throughput [Mbps] |
|-----------|-------|-----------------|-------------|-----------------------|-----------------------|-------------------|
| AES       | 1R    | 562             | -           | 211                   | 11                    | 2450              |
| AES       | 1R    | 136             | 10          | 308                   | 11                    | 3585              |
| Scream-10 | 1R    | 251             | -           | 321                   | 20                    | 2050              |
| Scream-10 | 1R    | 167             | 16          | 287                   | 20                    | 1836              |
| Scream-10 | 2R    | 416             | -           | 193                   | 10                    | 2470              |
| Scream-10 | 2R    | 190             | 16          | 278                   | 10                    | 2965              |
| SCREAM-10 | 1R    | 512             | -           | 302                   | $20 \cdot (\ell + 1)$ | 1932              |
| SCREAM-10 | 2R    | 571             | -           | 146                   | $10 \cdot (\ell + 1)$ | 1870              |

## Implementation: overview

- Hardware:
  - The tweakable block cipher costs about the same as AES
  - Low overhead for TAE mode (limited extra memory)
  - Parallelism can be leveraged in a pipelined implementation
- Micro-controller:
  - Good performance (< 8k cycles)
  - Very good if masking is needed
- High-end CPU:
  - Parallelism exploited with SIMD
  - Performance similar to AES-GCM (excluding hardware AES instructions)


## **SCREAM** Features

#### TAE Mode

- Nonce-based AEAD
- Fully parallelizable
- 128-bit security
- Low overhead (1TBC)
- Minimal extension
- Patent-free?

### LS Tweakable Block Cipher

- Clean and simple design
  - SPN, Wide-trail
  - Simple bounds for trails
- Scalable
  - Hardware: small state
  - Microcontrollers: masking
  - High-end CPUs: vectorized

High security, high performance

Small tweaks to fix initial mistakes

The tweakable block cipher is also a useful primitive in itself.


### Extra Slides


## FPGA implementation results

#### Tweakable Block Cipher:

For Virtex 6 (XC6VLX240T-3FF1156):

|             | DP size | BRAMs         | UNROLL | REG_O | Cycles | Regs/LUTs (timing) | Slices (timing) | BRAMs (timing) | F<sub>max</sub> (timing) | Regs/LUTs (area) | Slices (area) | BRAMs (area) | F<sub>max</sub> (area) |
|-------------|---------|---------------|--------|-------|--------|--------------------|-----------------|----------------|--------------------------|------------------|---------------|--------------|------------------------|
| SCREAM 128b | 128     | F             | F      |       | 20     | 404/823            | 251             | 0              | 321                      | 400/640          | 187           | 0            | 286                    |
| SCREAM 128b | 128     | F             | T      |       | 10     | 399/1520           | 416             | 0              | 193                      | 398/1033         | 282           | 0            | 153                    |
| SCREAM 128b | 128     | T             | F      | T     | 20     | 401/629            | 205             | 8x18k          | 287                      | 400/479          | 147           | 8x18k        | 261                    |
| SCREAM 128b | 128     | T             | F      | F     | 20     | 273/609            | 167             | 8x18k          | 287                      | 273/460          | 126           | 8x18k        | 261                    |
| SCREAM 128b | 128     | T<sup>2</sup> | T      | T     | 10     | 398/670            | 177             | 16x18k         | 277                      | 398/665          | 204           | 16x18k       | 252                    |
| SCREAM 128b | 128     | T<sup>2</sup> | T      | F     | 10     | 271/667            | 190             | 16x18k         | 278                      | 271/643          | 201           | 16x18k       | 252                    |
| SCREAM 16b  | 16      | F             | F      |       | 320    | 780/643            | 222             | 0              | 400                      | 260/359          | 107           | 0            | 237                    |
| AES1        | 128     | F             | F      |       | 11     | 686/2317           | 815             | 0              | 211                      | 526/1431         | 398           | 0            | 154                    |
| AES2        | 128     | F             | F      |       | 11     | 619/1712           | 562             | 0              | 211                      | 398/1430         | 392           | 0            | 154                    |
| AES3        | 128     | T             | F      |       | 11     | 398/481            | 136             | 10x18k         | 308                      | 398/468          | 152           | 10x18k       | 284                    |
| AES4        | 128     | T             | F      |       | 11     | 398/476            | 163             | 10x18k         | 308                      | 270/450          | 133           | 10x18k       | 285                    |

The "timing" columns use the timing performance strategy; the "area" columns use the area reduction strategy.

#### Notes:

<sup>1</sup> Parameter settings: T = True; F = False; --- = not applicable

<sup>2</sup> BRAMs operate at a 2x higher clock frequency than the rest of the core

<sup>3</sup> Key initialization requires 1 extra clock cycle for the 128b version or 8 clock cycles for the 16b version


## FPGA implementation results

#### Authenticated Encryption (full mode)

|             | DP size | TRUNC | PADD | UNROLL | Cycles | Regs/LUTs (timing) | Slices (timing) | BRAMs (timing) | F<sub>max</sub> (timing) | Regs/LUTs (area) | Slices (area) | BRAMs (area) | F<sub>max</sub> (area) |
|-------------|---------|-------|------|--------|--------|--------------------|-----------------|----------------|--------------------------|------------------|---------------|--------------|------------------------|
| SCREAM 128b | 128     | T     | T    | T      | X      | 917/2193           | 571             | 0              | 146                      | 917/1755         | 459           | 0            | 154                    |
| SCREAM 128b | 128     | T     | T    | F      | Y      | 920/1932           | 512             | 0              | 302                      | 919/1392         | 363           | 0            | 289                    |
| SCREAM 128b | 128     | T     | F    | T      | X      | 918/2109           | 567             | 0              | 150                      | 917/1766         | 458           | 0            | 149                    |
| SCREAM 128b | 128     | T     | F    | F      | Y      | 920/1588           | 414             | 0              | 286                      | 919/1392         | 362           | 0            | 312                    |

The "timing" columns use the timing performance strategy; the "area" columns use the area reduction strategy.

$X = (A + P + 1) \cdot 10 + 2$; $Y = (A + P + 1) \cdot 20 + 2$, where $A$ is the number of 128b blocks of associated data and $P$ is the number of 128b blocks of the plaintext.
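
As a quick worked example of the cycle counts X and Y, the sketch below evaluates them for a given message. The function name and argument names are hypothetical; the constants are the ones in the formulas (10 cycles per block with UNROLL = T, 20 otherwise, plus one extra block and 2 cycles).

```python
def full_mode_cycles(a_blocks: int, p_blocks: int, unrolled: bool) -> int:
    """Cycle count of the full mode: X if unrolled (UNROLL = T), else Y."""
    per_block = 10 if unrolled else 20
    return (a_blocks + p_blocks + 1) * per_block + 2


# One block of associated data and four blocks of plaintext:
print(full_mode_cycles(1, 4, unrolled=True))   # X = (1 + 4 + 1) * 10 + 2 = 62
print(full_mode_cycles(1, 4, unrolled=False))  # Y = (1 + 4 + 1) * 20 + 2 = 122
```

The "+ 1" term reflects the single extra TBC call of the mode, so the relative overhead shrinks quickly as the message grows.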