A data warehouse acts as a central repository for most, or at least significant parts, of the data generated by an organization’s various business systems. The term is commonly credited to W. H. Inmon, and a widely quoted definition (due to Ralph Kimball) describes a data warehouse as a copy of transaction data specifically structured for querying and reporting. The data warehouse is the authoritative store of all fact and dimension data at the most granular level.
Generally, a data warehouse is housed on an organization’s mainframe server. Data is collected from a variety of online transaction processing (OLTP) applications and other sources, then selectively extracted and organized in the data warehouse’s database for use by analytical applications and user queries. This database is responsible for storing and serving all of that information.
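The selective extraction described above can be sketched in a few lines. This is a minimal illustration only, using Python's built-in sqlite3 module with in-memory databases; the table names (orders, fact_sales), the columns and the "shipped orders only" rule are assumptions made for the example and do not come from any particular system.

```python
import sqlite3

# Hypothetical OLTP source: an orders table in an operational database.
oltp = sqlite3.connect(":memory:")
oltp.execute(
    "CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL, status TEXT)"
)
oltp.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "acme", 120.0, "shipped"),
        (2, "acme", 80.0, "cancelled"),
        (3, "globex", 200.0, "shipped"),
    ],
)

# Hypothetical warehouse database with a table structured for reporting.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_sales (order_id INTEGER, customer TEXT, amount REAL)"
)


def load_shipped_orders(source, target):
    """Selectively extract shipped orders and load them into the warehouse."""
    rows = source.execute(
        "SELECT id, customer, amount FROM orders WHERE status = 'shipped'"
    ).fetchall()
    target.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", rows)
    target.commit()
    return len(rows)


loaded = load_shipped_orders(oltp, warehouse)
total = warehouse.execute("SELECT SUM(amount) FROM fact_sales").fetchone()[0]
print(loaded, total)
```

Only the rows that pass the selection rule reach the warehouse table, so analytical queries run against a copy of the data rather than against the operational system itself.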
While data warehouses vary in overall design, most of them are subject oriented, meaning that the stored information is linked to real-world events or objects. The data presented for analysis is also organized around a specific subject rather than around the functions of the enterprise, is integrated from an array of sources into a single unit, and is time-variant.
It is important to note that in data warehousing the data source systems are taken as given. Although a source system may have been built in a way that makes it hard to extract information, the data warehouse’s task is not to redesign the source system but to build a consistent, integrated and consolidated data structure despite the apparent problems in the source systems. Data warehousing achieves this through a variety of techniques, creating one or more new data warehouses whose data models support the required reporting and analysis.
Data warehousing stresses the collection of data from varied sources for analysis and access, but it does not usually start from the point of view of the end user or knowledge worker, who may need access to specialized or even local databases. The latter concept is commonly known as a data mart.
A data mart is a database, or a group of databases, intended to help managers make strategic decisions about their business. It is a subset of an enterprise’s data store, usually designed for a specific purpose or a major data subject, and may be distributed to support the enterprise as required.
A data mart is a decision support system built on a subset of the organization’s data and focused on specific functions or activities of the business. Data marts have defined business goals, such as assessing the effectiveness of marketing promotions, evaluating the influence of a new product’s launch on the company’s profits, or measuring and forecasting the performance of a newly created company division.
Multiple data marts are possible within a single enterprise, as long as each one is relevant to the one or more business units for which it was created. Data marts need not be related to, or dependent on, other data marts within the organization. However, if the data marts are built using conformed dimensions and facts, they will be related. Some organizations treat each business unit or department as the owner of its data mart, including all of its hardware, software and data. This allows each department to use, alter, augment and develop its data in whatever manner it prefers, without altering any information in other data marts or in the central data warehouse.
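To see why conformed dimensions make data marts related, consider a small sketch in plain Python. Everything here is hypothetical: a customer dimension shared by a sales mart and a marketing mart, with invented figures. Because both marts use the same customer keys, their totals can be lined up side by side.

```python
# Shared (conformed) customer dimension: both marts use the same keys.
dim_customer = {1: "acme", 2: "globex"}

# Hypothetical facts, one list per data mart, keyed by customer_id.
sales_mart = [(1, 120.0), (2, 200.0), (1, 30.0)]   # (customer_id, revenue)
marketing_mart = [(1, 40.0), (2, 10.0)]            # (customer_id, campaign spend)


def total_by_key(facts):
    """Sum the measure for each dimension key."""
    totals = {}
    for key, value in facts:
        totals[key] = totals.get(key, 0.0) + value
    return totals


def combine_marts(dim, left, right):
    """Line up totals from two marts using the shared dimension keys."""
    left_totals = total_by_key(left)
    right_totals = total_by_key(right)
    return {
        name: (left_totals.get(key, 0.0), right_totals.get(key, 0.0))
        for key, name in dim.items()
    }


combined = combine_marts(dim_customer, sales_mart, marketing_mart)
print(combined)
```

If each mart had defined its own customer keys independently, there would be no reliable way to match the two sets of totals, which is the sense in which unconformed marts remain unrelated.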
Usually, the database for a data mart is built around a star-join structure that best suits the requirements of the department’s users. The needs of the end users, that is, the department’s employees, are gathered in order to shape the star join. The data mart is generally housed in multidimensional technology, which provides great flexibility in analyzing data, although it is less effective for larger amounts of data. In addition, the data in data marts is highly indexed. An enterprise can therefore have multiple data marts serving the requirements of marketing, sales, operations and so on.
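A star-join structure places a central fact table at the middle of a set of dimension tables, and reports are produced by joining the fact table to each dimension. The following is a minimal sketch using Python's built-in sqlite3 module; the table names (fact_sales, dim_product, dim_region), the index names and the sample rows are assumptions chosen for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    -- Dimension tables describe the "who, what, where" of each fact.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, name TEXT);
    -- The fact table holds the measures plus one key per dimension.
    CREATE TABLE fact_sales  (product_id INTEGER, region_id INTEGER, amount REAL);

    INSERT INTO dim_product VALUES (1, 'widget'), (2, 'gadget');
    INSERT INTO dim_region  VALUES (1, 'north'), (2, 'south');
    INSERT INTO fact_sales  VALUES (1, 1, 10.0), (1, 2, 15.0), (2, 1, 7.5), (1, 1, 5.0);

    -- Data marts are typically heavily indexed on the join keys.
    CREATE INDEX idx_sales_product ON fact_sales (product_id);
    CREATE INDEX idx_sales_region  ON fact_sales (region_id);
    """
)


def sales_by_product_and_region(conn):
    """Star join: fact table joined to each of its dimension tables."""
    return conn.execute(
        """
        SELECT p.name, r.name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_region  AS r ON r.region_id  = f.region_id
        GROUP BY p.name, r.name
        ORDER BY p.name, r.name
        """
    ).fetchall()


report = sales_by_product_and_region(db)
for row in report:
    print(row)
```

The choice of dimensions is where the department's requirements enter the design: a marketing mart might add a campaign dimension, while an operations mart might add a warehouse or shift dimension.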
Data marts are of two types: dependent and independent. A dependent data mart draws its data from a data warehouse, whereas an independent data mart draws its data from the legacy applications environment. While dependent data marts are both architecturally and structurally sound, independent data marts lack stability and are architecturally unsound in the long run. The problem with independent data marts is that their shortcomings do not become clear until the enterprise has already built several of them.
In practice, the terms data mart and data warehouse each tend to imply the presence of the other in some form. However, the design of a data warehouse usually begins with an analysis of the data that already exists and a search for a suitable way to collect it so that it is easy to access whenever the need arises, whereas the design of a data mart usually begins with the needs of its users. A data warehouse combines the databases across the entire organization, whereas data marts are typically smaller and focus on a specific issue or department. Some data marts, known as dependent data marts, are logical or physical subsets of a larger data warehouse.
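The difference between a logical subset and a physical subset can be illustrated with a short sketch, again using Python's built-in sqlite3 module. Here a view stands in for the logical subset and a copied table for the physical one; the table and view names and the sample figures are hypothetical.

```python
import sqlite3

wh = sqlite3.connect(":memory:")
wh.executescript(
    """
    CREATE TABLE fact_sales (department TEXT, amount REAL);
    INSERT INTO fact_sales VALUES
        ('marketing', 100.0), ('sales', 250.0), ('marketing', 50.0);

    -- Logical subset: a view that always reflects the warehouse.
    CREATE VIEW mart_marketing_logical AS
        SELECT * FROM fact_sales WHERE department = 'marketing';

    -- Physical subset: a snapshot copied out of the warehouse.
    CREATE TABLE mart_marketing_physical AS
        SELECT * FROM fact_sales WHERE department = 'marketing';
    """
)

# A later load into the warehouse is visible through the view,
# but not in the physical copy until that copy is refreshed.
wh.execute("INSERT INTO fact_sales VALUES ('marketing', 25.0)")

logical_total = wh.execute(
    "SELECT SUM(amount) FROM mart_marketing_logical"
).fetchone()[0]
physical_total = wh.execute(
    "SELECT SUM(amount) FROM mart_marketing_physical"
).fetchone()[0]
print(logical_total, physical_total)
```

A physical subset can be indexed and tuned for its own users, at the cost of needing a refresh step; a logical subset stays current but shares the warehouse's workload.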
A data warehouse is a comprehensive aggregation of data, which may be physically distributed, whereas a data mart is a repository of data that may or may not derive from a data warehouse and that emphasizes ease of access and suitability for a particular purpose. In a nutshell, a data warehouse is a strategic yet unfinished concept, while a data mart is tactical and aims to meet an immediate need.
Creating a data mart offers an enterprise many advantages, such as quick and easy access to frequently needed data. It provides a combined view for a group of users and improves end-user response time. Because a data mart is relatively easy to create and its potential users are more clearly defined than those of a complete data warehouse, its advantages far outweigh its drawbacks. What is more, a data mart costs proportionally less than a full data warehouse.
Data marts may hold substantial amounts of data, at times hundreds of gigabytes or more, but they hold substantially less data than a data warehouse developed for the same enterprise. Furthermore, because data marts are aimed at specific business objectives, system planning and requirements analysis are more manageable, and design, implementation, testing and installation consequently cost less than they do for a data warehouse.
To sum up, first, data marts can be built in a matter of months and, more importantly, for hundreds of thousands of dollars rather than millions. Second, because they are completed quickly, they give the enterprise early models of success to use as a yardstick. Third, individual departments are able to use their information effectively and efficiently in a relatively simple way.