Power Analysis Tutorial
Manfred Aigner and Elisabeth Oswald
Institute for Applied Information Processing and Communication
University of Technology Graz
Inffeldgasse 16a, A-8010 Graz, Austria
Abstract. Performing a Differential Power Analysis (DPA) attack requires knowledge in several fields: statistics and cryptography for the attack itself, programming skills and experience in instrumentation to build an automatic measurement system, and electronics skills to improve the results. This tutorial provides information on all these topics on the basis of our experience.
1 Introduction
As more and more confidential data are exchanged electronically, ever greater importance is attached to protecting these data. Where cryptosystems are used in real applications, mathematical attacks are not the only ones that have to be taken into account. Hardware and software implementations themselves present a vast field of attacks. Side-channel attacks exploit information that leaks from a cryptographic device. One of these new attacks in particular has attracted much attention since it was announced. This method is called Differential Power Analysis (DPA) and was presented in 1998 by Cryptography Research. DPA uses information that naturally leaks from a cryptographic hardware device, namely its power consumption. A less powerful variant, Simple Power Analysis (SPA), was also announced by Cryptography Research.
What does a DPA attack require? First, an attacker must be able to measure the power consumption precisely. Second, the attacker needs to know which algorithm is computed, and third, the attacker needs the plaintexts or ciphertexts. The strategy of the attacker is to make a lot of measurements and then divide them, with the aid of some oracle, into two or more different sets. Then, statistical methods are used to verify the oracle. If and only if the oracle was right, one can see noticeable peaks in the statistics. This article makes this vague description of a DPA attack precise. In section 2, a power model is developed and related to the statistical methods used in DPA. Thereafter, a DPA attack is explained using the DES as an example. In the third section, a concrete implementation of DPA is discussed. The section begins with a C++-model, which will turn out to be useful to verify some countermeasures against DPA attacks. An attack on an 8052-microprocessor implementation is also described. In the fourth section, the application of DPA to asymmetric cryptosystems is discussed.
2 Power Analysis Foundations
Almost every digital circuit built today is based on Complementary Metal Oxide Semiconductor (CMOS) technology. Therefore it is necessary to understand the power consumption characteristics of this technology. If a CMOS gate changes its state, this change can be measured at the Vdd (or Vss) pin. The more circuits change their state, the more power is dissipated. In a synchronous design, gates are clocked, which means that all gates change their state at the same time. The power dissipated by the circuit can be monitored by using a small resistor Rm in series between Vdd (or Vss) and the true source (or ground). The two most essential parts of the power consumption during a change of state are the dynamic charge or discharge current (approx. 85%) and the dynamic short-circuit current (approx. 15%). This is sketched using the example of an inverter (see Figure 1). The output of each gate has a capacitive load, consisting of the parasitic capacitance of the connected wires and of the gates of the following stages. An input transition results in an output transition, which discharges or charges this parasitic capacitance, causing a current flow to Vdd (or Vss). This current is the dynamic charge or discharge current. For further information one should take a look at . By measuring the current flow on Vdd we can detect whether the output changed from 0 to 1 or not.
In differential CMOS logic, every output also appears in its inverted form, which means that a transition always causes a charge and a discharge on the output and the inverted output. By measuring the current on Vdd or Vss one cannot distinguish high and low transitions, but it is possible to detect whether a transition occurred or not.
Logic with precharge characteristic always charges the output capacitance during a precharge cycle and discharges it during the evaluation cycle if the output value differs from the precharge value. By observing the current flow one can detect changes of the output node. Precharge logic has a much higher power consumption than differential or standard CMOS logic, because the dynamic charge current also appears in situations where the output value does not toggle.
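The three observations above can be turned into a rough bit-level count of charge events per clock cycle. The following is a minimal sketch, not a calibrated model: the function names are hypothetical, an 8-bit register is assumed, and every bit is assumed to drive the same capacitive load.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical helpers: each returns a unitless number of charge events,
// assuming every bit of an 8-bit register drives an identical load.

// Standard CMOS: only outputs switching 0 -> 1 draw charge current from Vdd.
unsigned charge_events_standard(std::uint8_t before, std::uint8_t after) {
    const auto rising = static_cast<std::uint8_t>(~before & after);
    return static_cast<unsigned>(std::bitset<8>(rising).count());
}

// Differential CMOS: any toggle charges either the output or its inverse,
// so 0 -> 1 and 1 -> 0 look the same on the supply line.
unsigned charge_events_differential(std::uint8_t before, std::uint8_t after) {
    const auto toggled = static_cast<std::uint8_t>(before ^ after);
    return static_cast<unsigned>(std::bitset<8>(toggled).count());
}

// Precharge logic: every output that evaluates to a value different from
// its precharge value is discharged and has to be recharged afterwards,
// independent of the value it held in the previous cycle.
unsigned charge_events_precharge(std::uint8_t value, std::uint8_t precharge_value) {
    const auto differing = static_cast<std::uint8_t>(value ^ precharge_value);
    return static_cast<unsigned>(std::bitset<8>(differing).count());
}
```

For a register that moves from 0x0F to 0x3C, this sketch counts 2 events in the standard model and 4 in the differential model.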
2.1 Power Model
As a result of the previous explanation we can deduce a model of the instantaneous power consumption of a CMOS circuit. The power consumption of a circuit at a particular time t is the sum of the power dissipated by all gates at this time. Of course, when measuring this power dissipation we cannot prevent the influence of noise. Various noise components have to be considered, such as external noise, intrinsic noise, quantization noise and algorithmic noise (see ). By the careful use of measurement equipment one can reduce external noise. Algorithmic noise will be reduced by DPA itself. The other two noise components should be small compared to the power consumption. We can state this more formally as follows:
Simple Power Model. Let t denote the time, and let N(t) be a normally distributed random variable which represents the noise components. Let f(g, t) denote the power consumption of gate g at time t. Then a simplified model for the power consumption is the function

$$P(t) = \sum_{g} f(g, t) + N(t).$$
The next step is to relate this model to statistics. If we consider each f(g, t) as a random variable from an unknown probability distribution, what can we say about P(t)? If all f(g, t) are drawn randomly and independently from this probability distribution, then the Central Limit Theorem says that P(t) is approximately normally distributed. In a DPA attack the attacker divides the power measurements into two or more different sets and tries to compute the difference between these sets in order to verify the oracle. As we have related the power consumption to statistics, we can also say that the attacker wants to compute the difference between the two probability distributions. The methods for doing so are discussed in the next section.
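To get a feeling for the power model, one can simulate it. The following is a minimal sketch under loud assumptions: the per-gate consumption f(g, t) is drawn uniformly from [0, 1), which is an arbitrary choice and not taken from the text, and the function names `sample_power` and `sample_trace` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// One sample of P(t) = sum_g f(g, t) + N(t).
// Assumption (not from the text): each f(g, t) is uniform on [0, 1).
double sample_power(std::size_t gates, double noise_sigma, std::mt19937& rng) {
    assert(gates > 0 && noise_sigma > 0.0);
    std::uniform_real_distribution<double> gate_power(0.0, 1.0);
    std::normal_distribution<double> noise(0.0, noise_sigma);
    double p = 0.0;
    for (std::size_t g = 0; g < gates; ++g) {
        p += gate_power(rng);  // contribution f(g, t) of one gate
    }
    return p + noise(rng);     // add the noise term N(t)
}

// A simulated trace: one independent sample of P(t) per time step.
std::vector<double> sample_trace(std::size_t samples, std::size_t gates,
                                 double noise_sigma, unsigned seed) {
    std::mt19937 rng(seed);
    std::vector<double> trace;
    trace.reserve(samples);
    for (std::size_t t = 0; t < samples; ++t) {
        trace.push_back(sample_power(gates, noise_sigma, rng));
    }
    return trace;
}
```

With 1000 gates the simulated values cluster around 500, and a histogram of the trace should look close to a bell curve even though no single gate contribution is normally distributed.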
2.2 Hypothesis Testing
If one works with probability distributions it is necessary to have characterizations of these distributions. Well-known characteristics are the expectation, the variance or, more generally, the moments of the distribution. Of course the true expectation (or true variance) is unknown, so one has to estimate it. The construction of good estimators is one of the main goals in statistics. One can easily prove that if the $X_i$ are independent, identically distributed random variables, the sample mean $\bar{X}$ and the sample variance $S^2$ are good estimators for the true expectation $E(X_i) = \mu$ and the true variance $Var(X_i) = \sigma^2$. Common strategies for constructing estimators are:
– using the empirical moments as estimators for the theoretical moments
– using the Maximum-Likelihood method.
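The two estimators mentioned above are short to write down. The following sketch uses the hypothetical names `sample_mean` and `sample_variance` and assumes the unbiased form of the sample variance (division by n - 1), which the text does not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sample mean: estimator for the true expectation mu.
double sample_mean(const std::vector<double>& x) {
    assert(!x.empty());
    double sum = 0.0;
    for (double v : x) {
        sum += v;
    }
    return sum / static_cast<double>(x.size());
}

// Sample variance: estimator for the true variance sigma^2.
// Assumption: unbiased form, dividing by n - 1.
double sample_variance(const std::vector<double>& x) {
    assert(x.size() >= 2);
    const double mean = sample_mean(x);
    double squared_deviations = 0.0;
    for (double v : x) {
        squared_deviations += (v - mean) * (v - mean);
    }
    return squared_deviations / static_cast<double>(x.size() - 1);
}
```

For the samples 1, 2, 3, 4, 5 this gives a mean of 3 and a variance of 2.5.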
2.3 Differential Power Analysis
In the previous section the foundations for DPA have been explained. Now we start with the power analysis itself. In the first part of this section we give a short review of the DES. Then the construction of an oracle for a known-ciphertext attack is explained in this context.
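The Data Encryption Standard (DES) was developed in the early 1970s by IBM. It has a Feistel structure and consists of 16 rounds, each applying the round function f:

$$L_i = R_{i-1}, \qquad R_i = L_{i-1} \oplus f(R_{i-1}, K_i), \qquad i = 1, \dots, 16.$$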
$K_i$ denotes the subkey of the $i$-th round. The last DES round differs from the others because L and R are not exchanged (see Figure 3). The oracle, or as we now call it, the selection function D, makes use of this fact. As we can see in Figure 3, $R_{15} = L_{16}$. The subkey splits up into eight blocks, one for every sbox (see Figure 2). Therefore we specify one target sbox, for which we list all possible ($= 2^6$) input values. We will refer to such an input value as a subkeyblock. As assumed above, we know the ciphertexts, and so we can calculate the value of some of the bits in $L_{15}$ for every possible subkeyblock. We select one of these bits as our target bit. The value of the target bit is the value of our selection function D. If D = 1, the corresponding power measurement is put into sample set S1; if D = 0, it is put into S0. This procedure is repeated for many measurements, so that in the end we have, for every ciphertext and every subkeyblock, a classification of the corresponding measurement. Let n denote the number of ciphertexts (and thus of measurements). Then we can write all our classifications in a $2^6 \times n$ matrix.
So every line (row) of the matrix represents a possible key for the target sbox, and every column represents the classification of one ciphertext or measurement.
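The classification matrix can be sketched as follows. The selection function is passed in as a callback, because the real DES selection function needs the sbox tables and permutations, which are not reproduced here. The names `SelectionFn`, `Classification` and `classify` are hypothetical and serve only as an illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

constexpr std::size_t kSubkeyBlocks = 64;  // 2^6 candidates for one sbox

// Hypothetical signature: returns the target bit D for one ciphertext
// under one guessed subkeyblock.
using SelectionFn = std::function<bool(std::uint64_t ciphertext, std::uint8_t subkeyblock)>;

// One row per subkeyblock, one column per ciphertext (measurement).
using Classification = std::array<std::vector<bool>, kSubkeyBlocks>;

Classification classify(const std::vector<std::uint64_t>& ciphertexts,
                        const SelectionFn& D) {
    Classification rows;
    for (std::size_t k = 0; k < kSubkeyBlocks; ++k) {
        rows[k].reserve(ciphertexts.size());
        for (std::uint64_t c : ciphertexts) {
            rows[k].push_back(D(c, static_cast<std::uint8_t>(k)));
        }
    }
    return rows;
}
```

Entry `rows[k][j]` then states whether measurement j belongs to S1 (true) or S0 (false) if subkeyblock k is assumed.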
For the DPA attack we go through all rows of this matrix and build the two sample sets S0 and S1 for each of them. Then we compute the pointwise means M0 and M1 of the samples in the sets and compute their difference. For the correct subkeyblock there must be a peak in the trace of the difference.
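The difference-of-means step for a single subkeyblock can be sketched like this. The names `Trace`, `difference_of_means` and `peak_height` are hypothetical. The sketch assumes that all traces have the same length and are aligned in time, and that both sample sets are non-empty.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Trace = std::vector<double>;

// Pointwise M1 - M0 for one subkeyblock; d[i] is the selection bit D
// that classifies traces[i] into S1 (true) or S0 (false).
Trace difference_of_means(const std::vector<Trace>& traces,
                          const std::vector<bool>& d) {
    assert(!traces.empty() && traces.size() == d.size());
    const std::size_t len = traces.front().size();
    Trace sum0(len, 0.0), sum1(len, 0.0);
    std::size_t n0 = 0, n1 = 0;
    for (std::size_t i = 0; i < traces.size(); ++i) {
        assert(traces[i].size() == len);
        Trace& sum = d[i] ? sum1 : sum0;
        (d[i] ? n1 : n0) += 1;
        for (std::size_t t = 0; t < len; ++t) {
            sum[t] += traces[i][t];
        }
    }
    assert(n0 > 0 && n1 > 0);
    Trace diff(len);
    for (std::size_t t = 0; t < len; ++t) {
        diff[t] = sum1[t] / static_cast<double>(n1) - sum0[t] / static_cast<double>(n0);
    }
    return diff;
}

// Largest absolute value in the difference trace.
double peak_height(const Trace& diff) {
    double peak = 0.0;
    for (double v : diff) {
        peak = std::max(peak, std::fabs(v));
    }
    return peak;
}
```

Running this for all 64 rows and comparing the peak heights gives a simple ranking of the subkeyblock candidates. How large a peak has to be to count as significant depends on the noise and on the number of measurements.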
3 Practical Implementation
This section considers implementation issues. Let us start with some SPA experiments to familiarize ourselves with the measurement equipment. SPA tries to identify single instructions in the power trace without statistical methods. It can be used to detect the portions of the power trace where the target bit for DPA is manipulated, and it can be used to develop a good measurement setup. We used an 8051-compatible ATMEL 89S8252 microcontroller for our experiments.
In the first stage it is useful to perform simple operations on the microprocessor and to play around with the various possibilities for the measurement setup.
We execute a number of mov addr, #0 and mov addr, #255 instructions and measure the power consumption. The aim of this experiment is to optimize the measurement setup of the microcontroller board. It is easy to see that noise on the power supply seriously reduces the precision of the measurements. The first step to improve our setup is therefore to reduce noise on the power supply. The next step is to choose the right value for the resistor Rm between the global supply and the supply pin of the controller, across which we measure the current profile. A bigger Rm would mean a higher voltage swing across the resistor, which would be easier to measure. One has to keep in mind, however, that the voltage drop across Rm reduces the actual supply voltage of the controller, which in turn reduces the power consumption.
Therefore it is clear that big values for Rm do not directly lead to the desired effect of a higher voltage swing. Power consumption itself also depends heavily on the level of the supply voltage, so to obtain better results one should run the device at the highest supply voltage possible.
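The trade-off can be written down with Ohm's law. This is a sketch under simplifying assumptions: $V_S$ (global supply voltage), $I(t)$ (supply current of the controller) and $V_{dd}(t)$ (voltage actually seen at the supply pin) are symbols introduced here for illustration, and the numbers in the comment are illustrative, not measured.

```latex
V_m(t) = R_m \, I(t), \qquad V_{dd}(t) = V_S - R_m \, I(t)
% Illustrative numbers: R_m = 20\,\Omega and I = 10\,\mathrm{mA} give V_m = 0.2\,\mathrm{V};
% with V_S = 5\,\mathrm{V} the controller then sees V_{dd} = 4.8\,\mathrm{V}.
```

Because $I(t)$ itself shrinks as $V_{dd}(t)$ drops, the measured swing $V_m$ grows more slowly than $R_m$.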
A second effect of the reduced supply voltage has to be taken into account:
Input protection circuits of CMOS pads include clamp diodes which turn on when the voltage on the input pad is higher than the circuit's Vdd. Introducing Rm leads to a reduced Vdd, which makes these diodes conductive when the input value is high and the voltage drop across Rm is big. This means the internal circuit is supplied by input current from the input pads, which is not measured via Rm. Smaller values of Rm reduce this effect. Another way to get rid of this effect is to make the global supply voltage bigger than the high level, but this would reduce the high noise margin of the circuit.
Although the CMOS circuit still works with a big Rm, a bigger resistor influences the current profile more strongly. We measured nearly the same voltage swing across Rm for Rm = 1 Ω and Rm = 20 Ω.
Since the power consumption of CMOS logic arises mostly around clock transitions, the current profile has high-frequency components, which lead to voltage overshoot on Rm. To reduce this overshoot, a small, fast capacitor should be connected in parallel to Rm. The best way to find the optimal value of this capacitor is to try different devices and decide after some measurements whether the desired effect has been achieved. The input capacitance and resistance of the oscilloscope probe used for the measurements also have a big impact on this effect, so use an active differential probe with high input resistance and very low input capacitance.
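As a starting point for these experiments, one can estimate where the parallel capacitor begins to damp the signal. The sketch below assumes an ideal resistor Rm in parallel with an ideal capacitor Cm and ignores the probe and all parasitics, so it gives only a rough orientation. The function name is hypothetical.

```cpp
#include <cassert>

// Corner frequency in Hz of an ideal resistor Rm in parallel with an
// ideal capacitor Cm: f_c = 1 / (2 * pi * Rm * Cm).
// Components of the current profile well above f_c are attenuated.
double shunt_corner_frequency_hz(double rm_ohm, double cm_farad) {
    assert(rm_ohm > 0.0 && cm_farad > 0.0);
    constexpr double kPi = 3.14159265358979323846;
    return 1.0 / (2.0 * kPi * rm_ohm * cm_farad);
}
```

For Rm = 20 Ω and an assumed Cm = 1 nF this gives a corner frequency of roughly 8 MHz. The capacitor value is purely illustrative, and the final choice still has to be made by measurement as described above.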
After reducing the noise of the power supply and finding the best resistor Rm and capacitor Cm, it is possible to detect the 2-bit differences of two consecutive mov instructions of our microcontroller board on the oscilloscope.
See Figure 4 for the current profile of two mov commands. Graphs A and B each show the profile of a mov command, while C is the zoomed difference of these two samples. It is even easier to detect differences between distinct commands.
With a good setup we can even find out whether the branching condition of a JNZ or JZ command was fulfilled or not.
The next step is to implement a DPA attack. We do this on a DES C++-model first. C++ is often used for the functional verification of an algorithm's implementation before the algorithm is written in a hardware description language (HDL). In the top-down design flow one splits up the (hardware) module into submodules. The functionality of each of the submodules and the connections between them can be defined in the C++-model too. Finally one can produce cycle-tuned test vectors to verify the HDL-model.
3.1.1 Modelling the Power Consumption in the C++-Model. If one has a bit-level model of a hardware module, it is fairly easy to model the dynamic charge (discharge) of the capacitances too. It is sufficient to assert that the change of state of a bit implies current flow in the type of logic used. After all bits of a cycle are processed, a variable holds the value of the instantaneous power consumption. In the following example this procedure is sketched for standard CMOS logic and differential CMOS logic.