A Systems Approach to Dimensional Modeling in Data Marts (White Paper No. One,
March 1997). This paper views dimensional modeling in data marts from the viewpoint of the
Fast Analysis of Shared Multidimensional Information (FASMI) definition of OLAP. FASMI
implies that dimensional data modeling supports measurement, causal, and structural
modeling, as well as a specification and organization of data comprehensive enough to
support KDD. The paper addresses this need by offering an approach to
dimensional modeling designed to support the development of business area models of system
dynamics. The approach requires highly explicit, top-down conceptualization and data
inventory steps in order to sketch the broadest possible outline of the range of
measurement and cause-effect relations underlying data marts.
Data Mining and KDD: A Shifting Mosaic (White Paper
No. Two, March 1997). Data mining as a field has not yet finished defining and
conceptualizing its scope. There are at least three
distinct concepts of data mining being used by practitioners and vendors. This paper
defines the three concepts, associates them with three related concepts of Knowledge
Discovery in Databases (KDD), and argues that data mining is not automatic knowledge
discovery, and that the dream of making it so is, at best, an ideal motivating long-term
development.
Data Warehouses and Data Marts:
A Dynamic View (White Paper No. Three, March 1997). This paper explores three
patterns of data mart development and relationships with data warehouses: the top-down
model; the bottom-up model; and the parallel development model. All three models are seen
as unrealistic because they view development without explicit consideration of user
feedback and its impact on development. Three related models in the presence of user
feedback are then presented, their dynamics are discussed, and some predictions are made
about the likely popularity of each of the three feedback models in the future.
Evaluating OLAP Alternatives
(White Paper No. Four, March 1997). The rush to develop data warehouses and data marts has
gained considerable momentum from the presence of server-based On-line Analytical
Processing (OLAP) tools, including: Multidimensional Server-based (MDOLAP) tools; a number
of Relational OLAP (or ROLAP) products; and a new tool called Sybase IQ, which uses a
technology we can call Vertical Technology OLAP (VTOLAP). How do we choose an OLAP product
for a data warehouse or data mart? This White Paper (a) reviews the three OLAP product
categories, and (b) provides a set of criteria for product evaluation in specific product
contexts.
Object-Oriented Data
Warehousing (White Paper No. Five, August 1997). Data warehousing has largely
developed with little or no reference to Object-Oriented Software Engineering (OOSE). This
is consistent with its development out of two-tier client/server relational database
methodology. As data warehouses increasingly are supplemented with data marts, with data
stores of diverse type and content, and with internet and intranet front-ends, the
two-tier client/server paradigm has given way to a multi-tier conceptual and software
framework characterized by distributed objects. Multi-tiered data warehousing needs to be
reconceptualized in terms of distributed objects and therefore in terms of OOSE. This
paper offers such a reconceptualization with a focus on dimensional data modeling and its
relation to object modeling.
Dimensional Object Modeling
(White Paper No. Seven, April 30, 1998). An object modeling approach offers advantages in
supporting Dimensional Data Modeling (DDM) of data warehouses and data marts. The current
approach to making the basic decisions in producing a DDM is a pragmatic one. The
pragmatic approach has had considerable commercial success, but it still (a) makes tight
coupling of strategic goals and objectives to the DDM result a matter of art, rather than
a product of an explicit method or procedure; (b) results in a model composed of passive
containers for data attributes, rather than components that combine both data and
behavior; and (c) does not place DDM within a broader framework for integrating data and
process. That is, the pragmatic approach is too data-centric at a time when data warehousing is
concerned with integrating a complex diversity of server-based decision support system
functions. This paper examines the nature of DDM and Dimensional Object Modeling (DOM), develops the argument for tight
coupling of strategic goals and objectives to the DDM through an object modeling approach,
and discusses the advantages of the DOM approach in more detail.
Dimensional Modeling and E-R
Modeling in the Data Warehouse (White Paper No. Eight, June 22, 1998). While
there is consensus in the field of data warehousing on the desirability of using DM/star
schemas in developing data marts, there is an ongoing controversy over the form of the
data model to be used in the data warehouse. The "Inmonites" contend that
the data warehouse should be developed using an E-R model. The "Kimballites"
believe that the data warehouse should always be modeled using a DM/star schema. Indeed,
Kimball has stated that while DM/star schemas have the advantages of greater
understandability and superior performance relative to E-R models, their use involves no
loss of information, because any E-R model can be represented as a set of DM/star schema
models. This paper discusses two issues related to the controversy: first, the claim that
any E-R model can be represented as an equivalent set of DM/star schema models; and
second, the question of whether an E-R structured data warehouse, absent associative
entities, i.e. fact tables, is a viable concept, given recent developments in data
warehousing.
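The equivalence this abstract refers to, between an associative entity in an E-R model and a fact table in a star schema, can be illustrated with a minimal sketch. The table and column names (product, store, sale, units) are hypothetical and do not come from the paper; the sketch only shows one associative entity playing the fact-table role, not the paper's own argument.

```python
import sqlite3

# Hypothetical schema for illustration only: two entities and one
# associative entity that resolves their many-to-many relationship.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE store   (store_id   INTEGER PRIMARY KEY, region   TEXT);
    -- Associative entity; in a star schema it acts as the fact table,
    -- holding foreign keys to the dimensions plus a numeric measure.
    CREATE TABLE sale (
        product_id INTEGER REFERENCES product(product_id),
        store_id   INTEGER REFERENCES store(store_id),
        units      INTEGER
    );
    INSERT INTO product VALUES (1, 'toys'), (2, 'books');
    INSERT INTO store   VALUES (10, 'east'), (20, 'west');
    INSERT INTO sale    VALUES (1, 10, 5), (1, 20, 3), (2, 10, 7);
""")

# A typical dimensional query: aggregate the measure by a dimension attribute.
units_by_region = dict(conn.execute("""
    SELECT s.region, SUM(f.units)
    FROM sale AS f JOIN store AS s ON s.store_id = f.store_id
    GROUP BY s.region
"""))
```

Removing the sale table from this sketch would leave the two entities with nothing to aggregate, which is the kind of situation the second issue in the abstract asks about.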
Architectural Evolution in Data
Warehousing (White Paper No. Eleven, July 1, 1998). This paper is concerned
with DSS/data warehouse system architectural evolution in response to the growing
complexity of the enterprise DSS environment and with the relationship of new
architectures to a developing capability to handle the Dynamic Integration Problem. The
paper briefly describes and analyzes the following architectures: Top-Down; Bottom-Up;
Enterprise Data Mart (EDM); Data Stage/Data Mart (DS/DM); Distributed Data Warehouse/Data
Mart (DDW/DM); Distributed Knowledge Management (DKM); and variations that introduce the
ODS. In addition, it comments on the relationship between DKM architecture and data mining
and provides some brief comments on software tools for implementing DKMA.
DKMS Briefs
The Corporate Information
Factory or the Corporate Knowledge Factory? (DKMS Brief No. One,
July 10, 1998). Bill Inmon has introduced the Corporate Information Factory. But
should he have introduced the Corporate Knowledge Factory? Does it really make any
difference?
Is Data Staging Relational? A
Comment (DKMS Brief No. Five, November 11, 1998). A recent question raised by
Ralph Kimball is whether the data staging area is relational or has more to do with
sequential processing of flat files. This brief revisits and expands on Kimball's
viewpoint. It examines the above issue from the viewpoint of the data staging application
server, the data staging repository, and the metadata and metamodel that drive the data
staging process.
Data Warehouses, Data Marts,
and Data Warehousing: New Definitions and New Conceptions (DKMS Brief No.
Six, November 12, 1998). Data Warehousing has lately been undergoing substantial changes
in architecture and a broadening of related functional applications. With these changes
have come new definitions of the Data Warehouse and evolving conceptions of data
warehousing. This brief explores a number of data warehouse and data mart definitions and
their relation to the idea of the Distributed Knowledge Management System (DKMS). It also
analyzes the meaning of "data warehousing," in light of changes in data
warehousing systems and changes in definitions.
DKMA and The Data Warehouse Bus
Architecture (DKMS Brief No. Seven, November 13, 1998). The Data Warehouse
Bus Architecture is composed of "a master suite of conformed dimensions" and
standardized definitions of facts. Business process data marts throughout an enterprise
can "plug into" this bus to receive the dimension and fact tables they need. The
Bus supports the various processes and associated data marts that measure key aspects of
the processes. The logical union of these data marts is said to be the data warehouse. And
each data mart is said to be a subset of that data warehouse. This paper describes the
Data Warehouse Bus Architecture offered by Kimball, Reeves, Ross, and Thornthwaite, and
then contrasts it with DKM Architecture, an object-oriented alternative.
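As a rough illustration of the bus idea described in this abstract, the sketch below treats the bus as a registry of conformed dimensions that data marts must draw from, and the data warehouse as the union of the registered marts. All names (the dimension names, plug_into_bus, the fact table names) are hypothetical and are not taken from Kimball et al. or from the brief.

```python
# Hypothetical conformed dimensions published on the "bus".
CONFORMED_DIMENSIONS = {"date", "product", "customer", "store"}


def plug_into_bus(mart_name, fact, dimensions):
    """Register a data mart, rejecting any dimension the bus does not define."""
    unknown = set(dimensions) - CONFORMED_DIMENSIONS
    if unknown:
        raise ValueError(
            f"{mart_name} uses non-conformed dimensions: {sorted(unknown)}"
        )
    return {"name": mart_name, "fact": fact, "dimensions": frozenset(dimensions)}


# Two business-process data marts plugging into the same bus.
sales = plug_into_bus("sales", "sales_fact", ["date", "product", "store"])
orders = plug_into_bus("orders", "order_fact", ["date", "product", "customer"])

# The data warehouse viewed as the logical union of the data marts.
warehouse = {mart["fact"]: mart["dimensions"] for mart in (sales, orders)}

# Dimensions common to both marts; sharing them is what lets results
# from different marts be compared on the same terms.
shared = sales["dimensions"] & orders["dimensions"]
```

The check inside plug_into_bus is one simple way to model "conformance"; a real implementation would also need to agree on keys and attribute definitions, which this sketch leaves out.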
Presentations
Architectural Evolution in Data Warehousing (September 9, 1998).