A framework for big sensor data collection and trading
Mohammad Abu Alsheikh
Date of Issue: 2017
School of Computer Science and Engineering
With emerging sensing technologies such as wireless sensor networks and mobile crowdsourcing, data can be efficiently collected and used for analytics and optimization. This has ushered in the era of big sensor data (BSD), with applications in smart cities and the Internet of Things (IoT). Software as a service (SaaS) and data as a service (DaaS) are cloud infrastructures required to provide BSD services to customers regardless of the geographic and organizational boundaries between providers and customers. Many data systems have matured considerably, increasing our ability to understand and generate revenue from BSD. First, sensor networks and crowdsensing technologies have made it easy for individuals and firms to collect BSD. Second, big data platforms in the Hadoop ecosystem have simplified the processing of BSD in the cloud. As a result, BSD is now traded in online data marketplaces among data vendors and service providers. In this thesis, we present a framework for BSD collection and trading. Our framework includes a market model in which a data vendor sells BSD to a service provider. The service provider trains machine learning models on the purchased BSD and offers a service to customers. The thesis makes two major contributions. First, we consider data collection from the data sources to the service provider. We present a data compression algorithm that prevents data congestion and reduces the energy consumption of sensor devices. Our in-network approach can be easily tuned to exploit either the temporal or the spatial correlation of the data using an unsupervised neural network. The algorithm extracts intrinsic data features from previously collected historical samples to transform the raw data into a low-dimensional representation. Moreover, the proposed algorithm provides an error bound guarantee mechanism. Second, we address the problem of profit maximization for the service provider.
Specifically, we introduce optimal pricing schemes for separate and bundled selling of BSD services. In separate service selling, the service provider optimizes the requested data size and the service's subscription fee to attain the maximum achievable profit. With service bundling, the bundle's subscription fee and the requested data sizes of the grouped services are optimized to maximize the total profit of the cooperating service providers. This thesis reports several important research results. First, experiments on real-world datasets show that our compression algorithm outperforms several well-known traditional methods for data compression in sensor networks. The energy analysis shows that compressing the data reduces energy expenditure and hence extends the service lifespan severalfold. For example, a compression ratio of 35.56% in 5-hop transmissions reduces the overall energy consumption by a factor of 2.8 compared with raw data transmission. Second, we run extensive experiments on real-world datasets to determine the data utility in machine learning services. We observe that the recognition accuracy of the service increases as the requested data size increases, and decreases as it shrinks. Third, numerical experiments show the effectiveness of the proposed market models and pricing schemes in maximizing profit when BSD services are sold separately and as a bundle.
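The energy figure quoted in the abstract can be sanity-checked with a rough back-of-the-envelope sketch. It assumes, purely for illustration, that the compression ratio is the compressed size divided by the raw size and that transmission energy grows linearly with the number of bits sent and the number of hops. The function names, the unit energy cost, and the payload size are hypothetical; the thesis's own energy analysis may use a more detailed model, for example one that also accounts for the computation cost of compression.

```python
def transmission_energy(payload_bits, hops, energy_per_bit_per_hop=1.0):
    """Energy to relay a payload over `hops` links under an assumed linear cost model."""
    return payload_bits * hops * energy_per_bit_per_hop


def energy_reduction_factor(compression_ratio, hops, payload_bits=1_000_000):
    """Ratio of raw to compressed transmission energy.

    `compression_ratio` is assumed to mean compressed size / raw size.
    """
    raw = transmission_energy(payload_bits, hops)
    compressed = transmission_energy(payload_bits * compression_ratio, hops)
    return raw / compressed


# Values taken from the abstract: 35.56% compression ratio over 5 hops.
factor = energy_reduction_factor(compression_ratio=0.3556, hops=5)
print(f"Energy reduction factor: {factor:.2f}")  # about 2.81
```

Under this simplified model the hop count cancels out and the reduction factor is simply the reciprocal of the compression ratio, roughly 2.81, which is consistent with the 2.8-fold figure reported above.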
DRNTU::Engineering::Computer science and engineering::Computer systems organization::Computer-communication networks