
Data Warehouses

(Space Shuttle Endeavor's Final Journey to LAX - Jeff M. Wang)

 

Data Warehouse

 

A data warehouse (DW), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis, and is considered a core component of business intelligence. DWs are central repositories of integrated data from one or more disparate sources. The system’s logical design facilitates the integration of data sources and allows new, valuable data sources to be added without significant structural adjustment.
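As a minimal sketch of the "central repository" idea, the example below loads rows from two source systems into one table and runs a single report across both. It uses an in-memory SQLite database purely for illustration. The source names (`crm`, `webshop`), the `orders` table, and all sample rows are assumptions, not part of any particular warehouse product.

```python
import sqlite3

# Hypothetical source systems; names and rows are invented for illustration.
CRM_ORDERS = [("alice", 120.0), ("bob", 80.0)]
WEBSHOP_ORDERS = [("alice", 30.0), ("carol", 50.0)]


def build_warehouse():
    """Load rows from both sources into one central table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (source TEXT, customer TEXT, amount REAL)")
    for source, rows in (("crm", CRM_ORDERS), ("webshop", WEBSHOP_ORDERS)):
        # Tag every row with its origin so the source stays traceable.
        conn.executemany(
            "INSERT INTO orders (source, customer, amount) VALUES (?, ?, ?)",
            [(source, customer, amount) for customer, amount in rows],
        )
    conn.commit()
    return conn


def revenue_by_customer(conn):
    """A simple report that spans every loaded source at once."""
    cursor = conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    )
    return cursor.fetchall()


warehouse = build_warehouse()
report = revenue_by_customer(warehouse)
for customer, total in report:
    print(customer, total)
```

In this sketch, adding a third source only means inserting more rows with a new `source` label; the table and the report query stay unchanged, which is a small-scale picture of integrating sources without structural adjustment.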

Each organization has distinct operational practices and business models, which result in a variety of data generation platforms. Ultimately, a data warehouse should be greater than the sum of its data, and serve as an ongoing intelligent resource for use by multiple members of an organization, large or small. For that to happen, data warehouse technologies require data virtualization, processing, and transformation methods.

There are several delivery models, including physical appliances, such as dedicated traditional storage subsystems built to support analytics and business intelligence (BI). BI is an umbrella term that includes the applications, infrastructure and tools, and best practices that enable access to and analysis of information to improve and optimize decisions and performance. With the ongoing evolution of the cloud, cloud-based solutions, seen as agile and less capital-intensive, aim to simplify both the hosting and the analysis of data in an increasingly complicated environment.

In addition to the explosive growth in the amount of data and data sources we’ve seen in recent years, another motivation for creating even more sophisticated data warehousing systems is the ever-increasing need for customizable business intelligence and analytics. 

 

Data Integration

 

Whatever your big data application is, and whatever types of big data you are using, the real value will come from integrating different types of data sources and analyzing them at scale.

Data integration means bringing together data from diverse sources and turning it into coherent and more useful information (or knowledge). The main objective here is taming, or more technically managing, data and turning it into something you can make use of programmatically. A data integration process involves many parts. It starts with discovering, accessing, and monitoring data and continues with modeling and transforming data from a variety of sources. Moreover, integration of diverse datasets significantly reduces the overall data complexity. The data becomes more available for use and unified as a system of its own. Such a streamlined and integrated data system can increase the collaboration between different parts of your data systems. Each part can now clearly see how its data is integrated into the overall system, including the user scenarios and the security and privacy processes around it.
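The modeling and transforming steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sources (a sales export and a support system), their field names, and the unified schema are all hypothetical, and a real integration pipeline would also cover discovery, access control, and monitoring.

```python
# Hypothetical raw records from two sources with different shapes.
SALES_ROWS = [
    {"Customer": " Alice ", "Total": "120.50"},
    {"Customer": "Bob", "Total": "80"},
]
SUPPORT_ITEMS = [
    {"user_name": "ALICE", "tickets": 2},
]


def normalize_name(raw):
    """Bring customer names from every source into one canonical form."""
    return raw.strip().lower()


def transform_sale(row):
    """Map a sales row onto the unified (assumed) schema."""
    return {
        "customer": normalize_name(row["Customer"]),
        "revenue": float(row["Total"]),
        "tickets": 0,
    }


def transform_support(item):
    """Map a support record onto the same unified schema."""
    return {
        "customer": normalize_name(item["user_name"]),
        "revenue": 0.0,
        "tickets": int(item["tickets"]),
    }


def integrate(sales_rows, support_items):
    """Merge both sources into one view keyed by customer."""
    records = [transform_sale(r) for r in sales_rows]
    records += [transform_support(i) for i in support_items]
    unified = {}
    for record in records:
        entry = unified.setdefault(
            record["customer"], {"revenue": 0.0, "tickets": 0}
        )
        entry["revenue"] += record["revenue"]
        entry["tickets"] += record["tickets"]
    return unified


customers = integrate(SALES_ROWS, SUPPORT_ITEMS)
for name, summary in sorted(customers.items()):
    print(name, summary)
```

Once both sources share one schema, downstream code no longer needs to know that " Alice " in the sales export and "ALICE" in the support system are the same customer, which is the reduction in complexity the paragraph above describes.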

 

[More to come ...]



   

 