An Overview of Data Warehousing and OLAP Technology

1996 | Surajit Chaudhuri, Umeshwar Dayal
This paper provides an overview of data warehousing and OLAP (Online Analytical Processing) technologies, emphasizing the requirements that set them apart from operational systems. Data warehousing is a collection of decision support technologies aimed at enabling knowledge workers to make better and faster decisions. It is distinct from traditional online transaction processing (OLTP) applications, which focus on day-to-day operations. A data warehouse stores historical, summarized, and consolidated data, and is typically much larger than an OLTP database. Its workload consists of complex queries and analysis, so query throughput and response time matter more than transaction throughput. Because of these different requirements, data warehouses are typically implemented separately from operational databases.
Data in a warehouse is often modeled multidimensionally to support complex analysis and visualization: numeric measures such as sales are analyzed along dimensions such as time, product, and city. OLAP operations include rollup, drill-down, slicing, dicing, and pivoting. Data warehouses may use ROLAP (Relational OLAP) or MOLAP (Multidimensional OLAP) servers, which store data in relational or multidimensional formats, respectively.
The architecture of a data warehouse includes tools for extracting data from source systems, cleaning it, loading it, and refreshing the warehouse. Data cleaning is essential because the warehouse is used for decision-making, so its data must be accurate. The architecture also includes warehouse servers, front-end tools for querying and analysis, and metadata management tools. Data warehouses may be distributed for scalability and availability, and may include departmental data marts.
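The OLAP operations named above can be illustrated with a minimal Python sketch. It is not taken from the paper: the fact rows, the dimension names (`quarter`, `city`, `product`), and the helper functions are all hypothetical, and a real OLAP server would run these operations over indexed or precomputed storage rather than a list in memory.

```python
from collections import defaultdict

# Hypothetical fact rows: three dimensions (quarter, city, product)
# and one numeric measure (amount).
SALES = [
    {"quarter": "Q1", "city": "Oslo", "product": "pen", "amount": 10},
    {"quarter": "Q1", "city": "Oslo", "product": "ink", "amount": 5},
    {"quarter": "Q1", "city": "Rome", "product": "pen", "amount": 7},
    {"quarter": "Q2", "city": "Oslo", "product": "pen", "amount": 12},
    {"quarter": "Q2", "city": "Rome", "product": "ink", "amount": 3},
]

def rollup(rows, dims, measure="amount"):
    """Sum the measure grouped by the given dimensions.

    Fewer dimensions give a coarser summary (rollup); adding a
    dimension back gives a finer one (drill-down).
    """
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in dims)] += row[measure]
    return dict(totals)

def dice(rows, **allowed):
    """Keep rows whose dimension values fall in the allowed sets."""
    return [r for r in rows if all(r[d] in vals for d, vals in allowed.items())]

def slice_(rows, dim, value):
    """Fix one dimension to a single value (a dice with one singleton set)."""
    return dice(rows, **{dim: {value}})

def pivot(rows, row_dim, col_dim, measure="amount"):
    """Arrange aggregated totals as a nested dict: table[row][col]."""
    table = defaultdict(dict)
    for (r, c), total in rollup(rows, [row_dim, col_dim], measure).items():
        table[r][c] = total
    return {r: dict(cols) for r, cols in table.items()}

by_quarter = rollup(SALES, ["quarter"])                  # rollup
by_quarter_city = rollup(SALES, ["quarter", "city"])     # drill-down
rome_only = slice_(SALES, "city", "Rome")                # slice
q1_pens = dice(SALES, quarter={"Q1"}, product={"pen"})   # dice
city_by_quarter = pivot(SALES, "city", "quarter")        # pivot
```

In this sketch, drill-down is simply `rollup` called with an extra dimension, which reflects the idea that the two operations move in opposite directions along the same level of detail.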
The design of a data warehouse involves defining the architecture, selecting storage servers, integrating servers and tools, designing the warehouse schema, and defining data placement and access methods. Data extraction, cleaning, and transformation are critical steps in the process, and data must be loaded and refreshed periodically. The paper also discusses research issues in data warehousing, including data cleaning, physical design, query optimization, and the management of materialized views. It highlights the challenges of efficiently processing complex queries, managing large volumes of data, and ensuring the performance and scalability of data warehouse systems. The paper concludes with a discussion of the importance of metadata management and the need for further research in these areas.
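As a rough illustration of the periodic refresh and materialized view management mentioned above, the following Python sketch keeps a precomputed summary (total amount per product) up to date by folding in only the newly loaded rows instead of recomputing from scratch. The data, the function names, and the insert-only assumption are all hypothetical; this is not the paper's algorithm, and handling deletions or updates would need more bookkeeping.

```python
from collections import defaultdict

def recompute(fact_rows):
    """Full recomputation: total amount per product from all fact rows."""
    totals = defaultdict(int)
    for product, amount in fact_rows:
        totals[product] += amount
    return dict(totals)

def refresh(view, new_rows):
    """Incremental refresh: fold only the newly loaded rows into the view.

    Assumes the batch contains insertions only.
    """
    updated = dict(view)  # leave the caller's copy untouched
    for product, amount in new_rows:
        updated[product] = updated.get(product, 0) + amount
    return updated

# Hypothetical data: (product, amount) pairs already in the warehouse,
# plus one batch arriving in a later load.
loaded = [("pen", 10), ("ink", 5), ("pen", 7)]
batch = [("ink", 3), ("cap", 2)]

view = recompute(loaded)     # materialize the summary once
view = refresh(view, batch)  # then apply each periodic load as a delta
```

The incremental result should match a full recomputation over all rows, which is the property a view maintenance scheme has to preserve while touching far less data.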