      Understanding and profiling a convolutional neural network application on different computing platforms using OpenCL

      Final Year Project Report - Nandi Shuvam (4.406Mb)
      Author
      Nandi, Shuvam
      Date of Issue
      2017
      School
      School of Computer Science and Engineering
      Abstract
      The decline of Moore’s law has led to a fundamental shift in the design of microprocessor architectures. Devices with parallel processing architectures such as GPUs, FPGAs and DSPs, initially used for dedicated tasks, are now gaining popularity as accelerators for more general-purpose computations. These devices gain performance by massively parallelising tasks across their compute units. CUDA and OpenCL are two application programming interface (API) models used to program parallel devices. The long-term objective of this project is the design of a hypothetical network of multiple processors capable of running applications in parallel. OpenCL is used to compare performance because it is a cross-compatible framework that runs across multiple heterogeneous platforms. First, this report examines the performance of several computing devices. A simple matrix multiplication kernel was executed with different mappings of the kernel onto the devices. This was followed by profiling a more complex application that recognises handwritten digits from the MNIST database. Performance in GOPS was computed from the measured execution times and the number of operations performed by the application. The second half of this project investigates free ISAs for implementing a processor as the core unit of the hypothetical engine. RISC-V was chosen and studied because it provides several extensions to its base integer instruction set, thereby supporting computationally intensive tasks. An existing processor implementation is examined, and a new implementation based on RV32IM is then developed.
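      The GOPS figure mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the report: the function names are hypothetical, the kernel is a plain Python triple loop and not an OpenCL kernel, and the operation count assumes one multiply and one add per inner-loop step, which may differ from how the report counts operations.

```python
import time


def matmul(a, b):
    """Naive triple-loop product of an M x K and a K x N matrix (lists of lists)."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            c[i][j] = acc
    return c


def matmul_op_count(m, k, n):
    # Assumption: each inner-loop step counts as one multiply plus one add.
    return 2 * m * k * n


def gops(op_count, seconds):
    # Giga-operations per second: operations / (seconds * 1e9).
    return op_count / (seconds * 1e9)


if __name__ == "__main__":
    m = k = n = 64
    a = [[1.0] * k for _ in range(m)]
    b = [[2.0] * n for _ in range(k)]
    start = time.perf_counter()
    matmul(a, b)
    elapsed = time.perf_counter() - start
    print(f"{gops(matmul_op_count(m, k, n), elapsed):.4f} GOPS")
```

      The same ratio applies to any timed kernel: count the operations analytically, measure the execution time on the device, and divide.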
      Subject
      DRNTU::Engineering::Computer science and engineering::Computer systems organization::Processor architectures
      DRNTU::Engineering::Computer science and engineering::Hardware::Register-transfer-level implementation
      DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
      DRNTU::Engineering::Computer science and engineering::Computer systems organization::Performance of systems
      Type
      Final Year Project (FYP)
      Rights
      Nanyang Technological University
      Collections
      • SCSE Student Reports (FYP/IA/PA/PI)


      Related items

      Showing items related by title, author, creator and subject.

      • Wearable technology for holistic entertainment experience 

        Zhu, Lingfei (2016)
        A solution for immersive holistic entertainment system, including the hardware feedback jacket, a Head Mount Device, and a PC game interface. It can interact with the user's head and hand movement or gestures, and provide ...
      • Markerless motion capture and analysis based on depth images 

        Bian, Zhenpeng (2015)
        The works presented in this thesis focus on depth images based human motion capture in realistic daily scenarios and two novel motion analysis frameworks on fall detection and human-computer interface based on motion capture ...
      • Creation of dynamic contact network through agent-based simulation 

        Leow, Guan Hao (2013)
        This project is a refinement of a previous Final Year Project, Agent-based Simulation – Canteen (SCE11-0028). It is a canteen environment simulation developed using Crowd Simulation for Military Operations (COSMOS), a ...

      NTU Library, Nanyang Avenue, Singapore 639798 © 2011 Nanyang Technological University. All rights reserved.
      DSpace software copyright © 2002-2015  DuraSpace