High-speed memory encryption and decryption in embedded system
Tan, Yng Tzer
Date of Issue: 2017
School of Computer Science and Engineering
Memory authentication is gaining importance in embedded systems because off-chip memories are prone to splicing and spoofing attacks. To maintain the confidentiality and integrity of data in external memory, the data are encrypted before being stored and decrypted when read, and memory integrity tree techniques are employed to verify their authenticity. However, the number of encrypt/decrypt operations for each memory access, the number of verifications executed when a node is accessed, and the number of updates executed when a node is modified all correspond to the height of the tree, i.e. log2(N) for a memory integrity tree with an arity of 2, where N is the number of data items. In addition, each verification and update operation requires two memory accesses to fetch the parent and the child node. As such, frequent off-chip memory accesses incur a large performance and power consumption overhead due to the verification process. This overhead increases notably for applications with large data sets (i.e. large N). The memory integrity tree used in this report is based on the Tamper-Evident Counter (TEC) tree. It is developed in C and implemented on an Altera DE II FPGA board with a Nios II processor. Two methods are presented to increase the performance of memory authentication. The first method introduces suitable cache configurations to mitigate the memory access time and allow the CPU to run at its full capacity. Various configuration parameters, such as cache size and cache line length, are studied to determine the optimal configuration for a given application. In addition, the feasibility of employing a cache-oblivious algorithm to optimize multiple caches with different cache line lengths is investigated. The second method explores the advantages of mapping the AES encryption and decryption operations onto custom hardware blocks in order to further reduce the verification overhead of memory authentication.
Applications from the widely known CHStone benchmark are used to evaluate the proposed methods. Experimental results show that optimal cache configurations can result in a performance gain of at least 200%, while utilizing custom instructions for the encryption/decryption operations leads to a performance gain of at least 300%. The combined use of optimal cache configurations and custom instructions results in an overall performance gain of 700%.
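The cost model stated in the abstract (tree height of log2(N) for arity 2, and two off-chip accesses per verification or update step) can be illustrated with a minimal C sketch. The function names `tree_height` and `accesses_per_verification` are hypothetical and do not come from the report; the sketch assumes a complete tree, rounds the height up when N is not a power of the arity, and ignores any caching.

```c
#include <stdint.h>

/* Hypothetical helper: height of a complete integrity tree over n_leaves
 * data items, i.e. ceil(log_arity(n_leaves)). For arity 2 and N a power
 * of two this is the log2(N) figure quoted in the abstract. */
static unsigned tree_height(uint64_t n_leaves, unsigned arity)
{
    unsigned height = 0;
    uint64_t covered = 1;   /* number of leaves a tree of this height can hold */

    if (arity < 2) {
        return 0;           /* not a valid tree; assumption: report height 0 */
    }
    while (covered < n_leaves) {
        if (covered > UINT64_MAX / arity) {
            /* One more level would exceed 64 bits, so it covers any n_leaves. */
            height++;
            break;
        }
        covered *= arity;
        height++;
    }
    return height;
}

/* Hypothetical helper: uncached off-chip accesses for one verification pass
 * from a leaf to the root, assuming two accesses per level (parent and child
 * node), as described in the abstract. */
static unsigned accesses_per_verification(uint64_t n_leaves, unsigned arity)
{
    return 2u * tree_height(n_leaves, arity);
}
```

Under these assumptions, a binary tree over 1024 data items has height 10, so a single uncached verification pass touches off-chip memory 20 times. This per-access cost is what the cache configurations and AES custom instructions aim to reduce.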
Final Year Project (FYP)
Nanyang Technological University