Distributed machine learning on public clouds
Tran, Manh Tu
Date of Issue: 2017-11-23
School of Computer Science and Engineering
Machine learning (ML) is prevalent in today’s world. Originating from the need to improve artificial intelligence, scientists attempted to have machines learn from data. Today, machine learning is used in a wide range of applications, from scientific fields such as bioinformatics, brain-machine interfaces, and DNA sequence classification to smart ecosystems built on computer vision, natural language processing, speech recognition, and many more. Because it is easy to adapt, more and more people are starting to use it for their own use cases. However, they need capable hardware to get started, which not everyone can afford. Public cloud service providers, such as Amazon with EC2, Microsoft with Azure, and Google with GCE, offer infrastructure for machine learning at a more affordable cost. In the past, when use cases of machine learning were limited and the amount of available data was small, machine learning could be done on a single machine. That is no longer the case in today’s world, where exabytes of data are collected in a single day. Distributed machine learning has arisen as the solution for large-scale machine learning. However, in distributed settings, machine learning does not scale linearly as more nodes are used. This project aims to construct predictive models to predict the optimal distributed settings for a sample dataset.
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Final Year Project (FYP)
Nanyang Technological University