August 25, 2014 | Marcos D. Assunção, Rodrigo N. Calheiros, Silvia Bianchi, Marco A. S. Netto, Rajkumar Buyya
This paper discusses approaches and environments for carrying out analytics on Clouds for Big Data applications. It focuses on four key areas: data management and supporting architectures, model development and scoring, visualisation and user interaction, and business models. Through a detailed survey, the paper identifies gaps in technology and provides recommendations for future research directions in Cloud-supported Big Data computing and analytics.
Big Data refers to the challenge of managing and gaining insights from vast amounts of data generated by organisations. Analytics solutions that mine structured and unstructured data are crucial for organisations to understand customer needs, predict demands, and optimise resource use. Despite the popularity of analytics and Big Data, implementing them is complex and time-consuming. Cloud computing offers flexibility by allowing organisations to pay only for the resources and services they use, but challenges remain in making Clouds an ideal platform for scalable analytics.
Cloud computing has revolutionised the IT industry by enabling organisations to use resources on a pay-as-you-go basis. However, several technical issues must be addressed, such as data management, model tuning, privacy, data quality, and data currency. The paper surveys existing work on solutions to provide analytics capabilities for Big Data on the Cloud, focusing on key issues in the phases of an analytics solution. It highlights challenges in data management, integration, and processing, and discusses existing models for data storage and retrieval, data diversity, velocity, and integration, as well as resource scheduling for data processing tasks.
The paper also discusses data storage solutions, including distributed file systems such as Google File System (GFS), object stores such as Amazon S3, and NoSQL databases. It explores data integration solutions, such as Birst and IVOCA, which help organisations manage and integrate data from multiple sources. It further covers data processing and resource management, including MapReduce and Hadoop, and discusses challenges in Big Data management, such as data variety, volume, and velocity.
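To make the MapReduce model mentioned above concrete, the following is a minimal single-process sketch of its three stages (map, shuffle, reduce) applied to word counting. It is only an illustration under stated assumptions: the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, nothing here uses the Hadoop API, and a real deployment would distribute each stage across many Cloud nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle, and reduce in sequence on an in-memory dataset.

    Hypothetical helper for illustration; a framework such as Hadoop
    would run the map and reduce calls in parallel on separate workers.
    """
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    lines = ["big data on clouds", "Big Data analytics"]
    print(word_count(lines))
```

Because each call to `map_phase` depends only on its own record, and each call to `reduce_phase` only on its own key, both stages can be spread over as many machines as are available, which is what makes the model a natural fit for elastic Cloud resources.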
In addition, the paper addresses model building and scoring, highlighting the importance of using data to build models for forecasting and prescriptions. It discusses open challenges in model building and scoring, such as the need for techniques that can explore the rapid elasticity and large scale of Cloud systems. The paper also explores visualisation and user interaction, noting the importance of good visualisation tools in facilitating navigation and understanding of data. It highlights the challenges of network bottlenecks in Cloud environments and the need for better interactive interfaces for Big Data analytics.