How are fact tables typically connected to dimension tables in a snowflake schema?

  • Through a circular relationship
  • Through direct one-to-one relationships
  • Through indirect relationships via intermediary tables
  • Without any connections
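In a snowflake schema, fact tables are typically connected to dimension data through indirect relationships via intermediary tables. The fact table joins directly to its primary dimension tables, but those dimensions are normalized into sub-dimension tables, so many dimension attributes can be reached only through one or more intermediate joins. This reduces redundancy within the dimensions at the cost of extra joins at query time.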
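The join path described above can be sketched with Python's built-in `sqlite3` module. The table and column names (`fact_sales`, `dim_product`, `dim_category`) and the sample rows are invented for illustration; this is a minimal sketch, not a recommended warehouse design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Sub-dimension created by normalizing the product dimension (hypothetical)
    CREATE TABLE dim_category (
        category_id   INTEGER PRIMARY KEY,
        category_name TEXT NOT NULL
    );
    -- Primary dimension: the fact table references this table directly
    CREATE TABLE dim_product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL,
        category_id  INTEGER NOT NULL REFERENCES dim_category(category_id)
    );
    CREATE TABLE fact_sales (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
        amount     REAL NOT NULL
    );
    INSERT INTO dim_category VALUES (1, 'Beverages'), (2, 'Snacks');
    INSERT INTO dim_product VALUES
        (10, 'Green tea', 1), (11, 'Cola', 1), (12, 'Crackers', 2);
    INSERT INTO fact_sales VALUES (100, 10, 4.0), (101, 11, 2.5), (102, 12, 3.0);
""")


def sales_by_category(connection):
    """Sum sales per category; the category is reachable only via dim_product."""
    rows = connection.execute("""
        SELECT c.category_name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product  AS p ON p.product_id  = f.product_id
        JOIN dim_category AS c ON c.category_id = p.category_id
        GROUP BY c.category_name
        ORDER BY c.category_name
    """).fetchall()
    return dict(rows)


totals = sales_by_category(conn)
print(totals)
```

In a star schema the category name would typically be stored on the product dimension itself, so the second join would not be needed.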

An index that contains a subset of the columns of a table and includes only the columns frequently used in queries is known as a _______ index.

  • Composite
  • Covering
  • Dense
  • Sparse
A covering index contains every column that a given query references, which is usually a small subset of the table's columns chosen from frequently run queries. Because the database engine can then answer the query from the index alone, without reading the underlying table rows, a covering index can significantly improve query performance.
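As a rough illustration, the sketch below uses Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN` to compare a query the index covers with one it does not. The `orders` table, the index name, and the helper `plan_for` are assumptions made for this example, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        total       REAL NOT NULL,
        notes       TEXT
    );
    -- The index holds only the two columns the frequent query needs
    CREATE INDEX idx_orders_customer_total ON orders (customer_id, total);
    INSERT INTO orders VALUES
        (1, 7, 20.0, 'gift'), (2, 7, 5.0, NULL), (3, 8, 9.0, NULL);
""")


def plan_for(connection, sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


# Both referenced columns are in the index, so the table itself is not read.
covered = plan_for(conn, "SELECT total FROM orders WHERE customer_id = ?", (7,))

# 'notes' is not in the index, so SQLite must also visit the table rows.
not_covered = plan_for(conn, "SELECT notes FROM orders WHERE customer_id = ?", (7,))

print(covered)
print(not_covered)
```

The trade-off is that every extra indexed column makes the index larger and slows down writes, so covering indexes are usually reserved for queries that run often.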

What is the purpose of using inheritance in database modeling?

  • Achieving data consistency
  • Enhancing query performance
  • Reducing redundancy and improving data integrity
  • Simplifying database design
The purpose of using inheritance in database modeling is to reduce redundancy and improve data integrity. Shared attributes and relationships are defined once on a parent entity (supertype) and inherited by its child entities (subtypes), which minimizes duplication and keeps the shared data consistent across the database.
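One common way to map inheritance onto relational tables is a supertype table plus one table per subtype that shares its primary key. The sketch below uses Python's built-in `sqlite3` module; the `person`, `student`, and `instructor` tables and the helper `add_student` are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    -- Supertype: attributes shared by every kind of person are stored once
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        email     TEXT NOT NULL UNIQUE
    );
    -- Subtypes: only the attributes specific to each kind of person
    CREATE TABLE student (
        person_id INTEGER PRIMARY KEY REFERENCES person(person_id),
        major     TEXT NOT NULL
    );
    CREATE TABLE instructor (
        person_id  INTEGER PRIMARY KEY REFERENCES person(person_id),
        department TEXT NOT NULL
    );
    INSERT INTO person VALUES (1, 'Ana', 'ana@example.com'), (2, 'Ben', 'ben@example.com');
    INSERT INTO student VALUES (1, 'Physics');
    INSERT INTO instructor VALUES (2, 'Mathematics');
""")


def add_student(connection, person_id, major):
    """Insert a student row; return False if no matching person exists."""
    try:
        connection.execute("INSERT INTO student VALUES (?, ?)", (person_id, major))
        return True
    except sqlite3.IntegrityError:
        return False


# A subtype row without a supertype row is rejected by the foreign key.
orphan_accepted = add_student(conn, 99, "History")

students = conn.execute("""
    SELECT p.name, p.email, s.major
    FROM student AS s JOIN person AS p USING (person_id)
""").fetchall()
print(orphan_accepted, students)
```

Because `name` and `email` live only in `person`, they cannot drift apart between the student and instructor records of the same individual.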

Scenario: A large corporation is planning to implement a centralized repository for historical data analysis. Which approach, Data Warehouse or Data Mart, would be more suitable for their needs and why?

  • Both Data Warehouse and Data Mart
  • Data Mart
  • Data Warehouse
  • Neither Data Warehouse nor Data Mart
For a large corporation needing centralized historical data analysis, a Data Warehouse is more suitable. A Data Warehouse integrates data from various sources into a single, unified repository, providing a comprehensive view for analytical purposes, and it supports complex queries and reporting across the entire organization's data. A Data Mart, by contrast, serves a single department or subject area, so it would not provide the centralized, organization-wide view this scenario calls for.

When would you use specialization instead of a regular entity in data modeling?

  • When all entities have a common set of attributes
  • When entities have identical attributes
  • When there is a need for a one-to-many relationship
  • When there is a need to represent unique attributes of a subset of entities
Specialization is used when there is a need to represent unique attributes of a subset of entities within a general category. It allows for the creation of more specific entity types with additional attributes.
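The same idea can be sketched outside a database with Python dataclasses. The `Employee` hierarchy and the `monthly_pay` helper below are invented for illustration: the general type carries the shared attributes, and each specialized type adds only the attributes unique to its subset.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    """General entity: attributes every employee has."""
    employee_id: int
    name: str


@dataclass
class SalariedEmployee(Employee):
    """Specialized entity: adds an attribute only salaried staff have."""
    annual_salary: float


@dataclass
class HourlyEmployee(Employee):
    """Specialized entity: adds attributes only hourly staff have."""
    hourly_rate: float
    hours_worked: float


def monthly_pay(employee):
    """Compute pay from the attributes specific to each specialization."""
    if isinstance(employee, SalariedEmployee):
        return employee.annual_salary / 12
    if isinstance(employee, HourlyEmployee):
        return employee.hourly_rate * employee.hours_worked
    raise TypeError("a plain Employee has no pay attributes")


salaried = SalariedEmployee(1, "Ana", 60000.0)
hourly = HourlyEmployee(2, "Ben", 20.0, 10.0)
print(monthly_pay(salaried), monthly_pay(hourly))
```

If every employee had exactly the same attributes, a single `Employee` entity would be enough and the subtypes would add nothing.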

The process of defining common attributes and behaviors for a group of entities is called _______.

  • Aggregation
  • Association
  • Generalization
  • Specialization
The process of defining common attributes and behaviors for a group of entities is called generalization. It works bottom-up: the characteristics shared by several specific entities are identified and moved into a more abstract, generalized entity. Specialization is the reverse, top-down process.

What is the purpose of surrogate keys in fact tables?

  • To establish foreign key relationships
  • To improve query performance
  • To provide a stable and unique identifier for each row in the fact table
  • To replace natural keys in dimension tables
The purpose of surrogate keys in fact tables is to provide a stable and unique identifier for each row. Surrogate keys are typically system-generated integers with no business meaning, so they stay the same even when business identifiers are corrected or reformatted. This stability makes relationships between tables easier to manage.
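A minimal sketch of this behavior, using Python's built-in `sqlite3` module: the `fact_sales` table, its columns, and the `record_sale` helper are hypothetical. Two rows share identical business values, and one is later renumbered, yet each keeps its own system-generated key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_sales (
        sale_key     INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key, no business meaning
        order_number TEXT NOT NULL,                       -- business identifier, may change
        product_code TEXT NOT NULL,
        quantity     INTEGER NOT NULL
    );
""")


def record_sale(connection, order_number, product_code, quantity):
    """Insert a fact row and return the surrogate key the database assigned."""
    cursor = connection.execute(
        "INSERT INTO fact_sales (order_number, product_code, quantity) VALUES (?, ?, ?)",
        (order_number, product_code, quantity),
    )
    return cursor.lastrowid


# Two rows with identical business values still get distinct identifiers.
first = record_sale(conn, "SO-1001", "TEA-01", 2)
second = record_sale(conn, "SO-1001", "TEA-01", 2)

# The business renumbers the first order; the surrogate key does not change.
conn.execute("UPDATE fact_sales SET order_number = 'SO-2001' WHERE sale_key = ?", (first,))

keys = [row[0] for row in conn.execute("SELECT sale_key FROM fact_sales ORDER BY sale_key")]
print(first, second, keys)
```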

How do ER diagram tools assist expert users in forward and reverse engineering processes?

  • By automating the creation of entities
  • By generating SQL scripts from diagrams
  • By providing pre-built templates for databases
  • By suggesting naming conventions
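ER diagram tools assist expert users in forward and reverse engineering by generating SQL scripts from diagrams. In forward engineering, the tool turns the visual entity-relationship model into DDL that creates or modifies the database. In reverse engineering, the direction is inverted: the tool reads an existing database schema and reconstructs the diagram from it.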
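The two directions can be imitated in a few lines of Python. This is a toy sketch, not how any particular ER tool works: the dictionary-based `model`, and the functions `forward_engineer` and `reverse_engineer`, are assumptions made for illustration, with SQLite standing in for the target database.

```python
import sqlite3


def forward_engineer(entities):
    """Turn a dict-based entity model into CREATE TABLE statements."""
    statements = []
    for table_name, columns in entities.items():
        column_defs = ",\n".join(
            f"    {name} {sql_type}" for name, sql_type in columns.items()
        )
        statements.append(f"CREATE TABLE {table_name} (\n{column_defs}\n);")
    return "\n\n".join(statements)


def reverse_engineer(connection):
    """Read table and column names back out of an existing SQLite database."""
    tables = [
        row[0]
        for row in connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    return {
        table: [col[1] for col in connection.execute(f"PRAGMA table_info({table})")]
        for table in tables
    }


# Hypothetical model standing in for what a diagram would describe.
model = {
    "customer": {"customer_id": "INTEGER PRIMARY KEY", "name": "TEXT NOT NULL"},
    "orders": {
        "order_id": "INTEGER PRIMARY KEY",
        "customer_id": "INTEGER NOT NULL REFERENCES customer(customer_id)",
    },
}

ddl = forward_engineer(model)          # model -> SQL script
conn = sqlite3.connect(":memory:")
conn.executescript(ddl)
recovered = reverse_engineer(conn)     # database -> model outline
print(ddl)
print(recovered)
```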

In which scenario would you prefer range-based partitioning over hash-based partitioning?

  • When data distribution is skewed
  • When the query workload involves range queries
  • When there are frequent insertions and deletions
  • When there is a need for data redundancy
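Range-based partitioning is preferred over hash-based partitioning when the query workload involves range queries. Because each partition holds a contiguous range of key values, the database can skip every partition whose range does not overlap the query's range and read only the few that do. Hash-based partitioning scatters adjacent key values across partitions, so a range query usually has to touch all of them.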
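The difference can be sketched in plain Python. The partition boundaries, the partition count, and the use of CRC32 as the hash function are all assumptions chosen for illustration; real databases use their own partitioning functions.

```python
import bisect
import zlib

# Hypothetical layout: partition 0 holds keys < 100, partition 1 holds 100-199,
# partition 2 holds 200-299, and partition 3 holds keys >= 300.
RANGE_BOUNDS = [100, 200, 300]
NUM_PARTITIONS = 4


def range_partition(key):
    """Locate the partition whose key range contains this key."""
    return bisect.bisect_right(RANGE_BOUNDS, key)


def hash_partition(key):
    """Assign a partition from a hash of the key (CRC32 used as a stand-in)."""
    return zlib.crc32(str(key).encode()) % NUM_PARTITIONS


def range_partitions_for_query(low, high):
    """Partitions a range-partitioned table must read for keys in [low, high]."""
    return list(range(range_partition(low), range_partition(high) + 1))


def hash_partitions_for_query(low, high):
    """Partitions that actually hold keys in [low, high] under hashing."""
    return sorted({hash_partition(key) for key in range(low, high + 1)})


range_hits = range_partitions_for_query(120, 180)
hash_hits = hash_partitions_for_query(120, 180)
print("range partitioning reads partitions:", range_hits)
print("hash partitioning reads partitions: ", hash_hits)
```

Hashing remains a reasonable choice when lookups are mostly by exact key and an even spread of data matters more than range scans.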

What measures can be taken at the application level to ensure data integrity in a distributed system?

  • Avoidance of data encryption
  • Centralized data storage
  • Lack of data validation
  • Use of distributed transactions
At the application level, a distributed system can protect data integrity by using distributed transactions. A distributed transaction treats operations on several nodes as a single unit: either all of them commit or none do, which prevents the inconsistencies that partial updates would cause. Thorough data validation complements this by rejecting corrupt or inaccurate data before it is written. The other options do not help: avoiding encryption and skipping validation weaken a system, and centralized storage is an architectural choice rather than an application-level measure in a distributed system.
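The all-or-nothing behavior can be sketched as a simplified, in-memory two-phase commit. The `Participant` class and the `transfer` function are hypothetical, and the sketch leaves out what a real protocol needs, such as durable logs, timeouts, and recovery after a node failure.

```python
class Participant:
    """Stand-in for one node that holds part of the data."""

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self._pending = None

    def prepare(self, delta):
        """Phase 1: vote yes only if the change keeps the balance non-negative."""
        if self.balance + delta < 0:
            return False
        self._pending = delta
        return True

    def commit(self):
        """Phase 2 (success): apply the change that was prepared."""
        if self._pending is not None:
            self.balance += self._pending
            self._pending = None

    def rollback(self):
        """Phase 2 (failure): discard the prepared change."""
        self._pending = None


def transfer(source, target, amount):
    """Move an amount between two nodes so that both change or neither does."""
    if amount <= 0:
        raise ValueError("amount must be positive")  # application-level validation
    changes = [(source, -amount), (target, amount)]
    votes = [node.prepare(delta) for node, delta in changes]
    if all(votes):
        for node, _ in changes:
            node.commit()
        return True
    for node, _ in changes:
        node.rollback()
    return False


a = Participant("node-a", 100)
b = Participant("node-b", 0)
ok = transfer(a, b, 30)        # both nodes vote yes, so both commit
failed = transfer(a, b, 500)   # node-a votes no, so neither balance changes
print(ok, failed, a.balance, b.balance)
```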