What strategies can be employed to optimize indexing for large-scale databases?
- Avoid indexing altogether for large-scale databases
- Choose appropriate column(s) for indexing
- Regularly rebuild all indexes
- Use fewer indexes to minimize overhead
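Optimizing indexing for large-scale databases starts with choosing the columns to index based on actual query patterns, such as the columns used in filters, joins, and sort orders. Every index speeds up some reads but adds write and storage overhead, so it's essential to strike a balance between query performance and maintenance overhead.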
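As a minimal sketch of matching an index to a query pattern, the snippet below uses Python's built-in sqlite3 module and compares the query plan before and after indexing the filtered column. The table, column, and index names are hypothetical, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

# Hypothetical table and column names, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

def query_plan(sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN output as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

lookup = "SELECT total FROM orders WHERE customer_id = 42"
plan_before = query_plan(lookup)  # no usable index yet: a full table scan

# Index the column that the frequent query filters on.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(lookup)  # the planner can now search via the index

print(plan_before)
print(plan_after)
```

The same check is a reasonable way to confirm that a proposed index is actually used by the queries it was meant to help.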
Scenario: A university has students and courses. Each student can enroll in multiple courses, and each course can have multiple students enrolled in it. What type of entity would you introduce to represent the relationship between students and courses in an ERD?
- Association entity
- Composite entity
- Derived entity
- Intersection entity
In this case, an intersection entity (also called an associative entity) is suitable. It resolves the many-to-many relationship between students and courses into two one-to-many relationships and stores attributes that belong to the enrollment itself, such as the enrollment date or grade.
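One way to picture the intersection entity is as a table keyed by both foreign keys. The sketch below uses sqlite3 with hypothetical table and column names (student, course, enrollment, enrolled_on, grade) and made-up sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Intersection entity: one row per (student, course) pair.
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id INTEGER NOT NULL REFERENCES course(course_id),
    enrolled_on TEXT,
    grade TEXT,
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO student VALUES (1, 'Ada'), (2, 'Ben');
INSERT INTO course VALUES (1, 'Databases'), (2, 'Algorithms');
INSERT INTO enrollment VALUES
    (1, 1, '2024-09-01', NULL),
    (1, 2, '2024-09-01', NULL),
    (2, 1, '2024-09-02', NULL);
""")

# One student, many courses.
courses_for_ada = [row[0] for row in conn.execute(
    "SELECT c.title FROM enrollment e JOIN course c ON c.course_id = e.course_id "
    "WHERE e.student_id = 1 ORDER BY c.title"
)]

# One course, many students.
students_in_databases = [row[0] for row in conn.execute(
    "SELECT s.name FROM enrollment e JOIN student s ON s.student_id = e.student_id "
    "WHERE e.course_id = 1 ORDER BY s.name"
)]
```

The composite primary key also prevents the same student from being enrolled in the same course twice.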
Which of the following techniques can be employed for database performance tuning?
- Data isolation
- Data replication
- Data validation
- Denormalization
Denormalization is one of the techniques employed for database performance tuning. It involves intentionally introducing redundancy into a database schema to improve read performance by reducing the need for joins and simplifying data retrieval operations.
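A small sketch of the trade-off, using sqlite3 and hypothetical tables: the normalized design needs a join to show the customer name next to each order, while the denormalized table stores the name redundantly on every order row and answers the same question from a single table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    total REAL NOT NULL
);
-- Denormalized copy: customer_name is repeated on every order row.
CREATE TABLE orders_denorm (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    customer_name TEXT NOT NULL,
    total REAL NOT NULL
);
INSERT INTO customer VALUES (1, 'Ada'), (2, 'Ben');
INSERT INTO orders VALUES (10, 1, 25.0), (11, 2, 40.0), (12, 1, 15.0);
INSERT INTO orders_denorm
    SELECT o.order_id, o.customer_id, c.name, o.total
    FROM orders o JOIN customer c ON c.customer_id = o.customer_id;
""")

# Normalized read path: requires a join.
normalized = conn.execute(
    "SELECT o.order_id, c.name, o.total FROM orders o "
    "JOIN customer c ON c.customer_id = o.customer_id ORDER BY o.order_id"
).fetchall()

# Denormalized read path: a single-table scan, no join.
denormalized = conn.execute(
    "SELECT order_id, customer_name, total FROM orders_denorm ORDER BY order_id"
).fetchall()
```

The cost is on the write side: renaming a customer now means updating every matching row in orders_denorm as well.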
In database performance tuning, _______ is the process of rearranging the way data is stored to improve query performance.
- Clustering
- Denormalization
- Partitioning
- Sharding
In database performance tuning, clustering is the process of rearranging the way data is stored to improve query performance. Clustering involves storing related data together physically on the disk, which can reduce disk I/O and improve query speed.
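The effect on disk I/O can be illustrated with a toy model rather than a real storage engine. The sketch below assumes a hypothetical page size of four rows and counts how many pages must be read to fetch all rows for one customer, first in arrival order and then with the rows stored sorted by customer.

```python
PAGE_SIZE = 4  # rows per simulated disk page (an arbitrary assumption)

def pages_touched(rows, key):
    """Count the distinct pages that hold at least one row with the given key."""
    return len({pos // PAGE_SIZE for pos, row in enumerate(rows) if row[0] == key})

# (customer_id, order_no) rows in arrival order: customers are interleaved.
arrival_order = [(order_no % 4, order_no) for order_no in range(16)]

# Clustered layout: the same rows stored physically sorted by customer_id.
clustered = sorted(arrival_order)

unclustered_pages = pages_touched(arrival_order, 2)  # rows spread over every page
clustered_pages = pages_touched(clustered, 2)        # rows packed into one page

print(unclustered_pages, clustered_pages)
```

In this toy layout, the four rows for customer 2 sit on four different pages before clustering and on a single page afterwards.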
Scenario: A database has a table with attributes {EmployeeID, ProjectID, HoursWorked}. Is this table in the third normal form (3NF)?
- Cannot be determined
- No
- Not applicable
- Yes
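Yes, assuming the composite key (EmployeeID, ProjectID) identifies each row. HoursWorked is the only non-key attribute and depends on the whole key, so there are no partial or transitive dependencies, and the table is in 3NF.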
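The dependency reasoning behind that answer can be sketched with a small helper that tests whether one set of columns determines another in sample data. This assumes (EmployeeID, ProjectID) is the key. The rows are made up, and a check on sample data can only refute a functional dependency, never prove it for all possible data.

```python
def determines(rows, lhs, rhs):
    """Return True if, in these rows, equal values on lhs always imply equal values on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        val = tuple(row[col] for col in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical sample rows for the table in the scenario.
rows = [
    {"EmployeeID": 1, "ProjectID": 10, "HoursWorked": 5},
    {"EmployeeID": 1, "ProjectID": 20, "HoursWorked": 8},
    {"EmployeeID": 2, "ProjectID": 10, "HoursWorked": 3},
]

# The full composite key determines the non-key attribute.
full_key_ok = determines(rows, ["EmployeeID", "ProjectID"], ["HoursWorked"])

# Neither part of the key does so alone, so there is no partial dependency.
partial_emp = determines(rows, ["EmployeeID"], ["HoursWorked"])
partial_proj = determines(rows, ["ProjectID"], ["HoursWorked"])
```

With only one non-key attribute there is also nothing for a transitive dependency to pass through.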
What is the primary focus of Dimensional Modeling?
- Data Integrity
- Normalization
- Performance for retrieval and analysis
- Transaction processing
The primary focus of Dimensional Modeling is optimizing performance for retrieval and analysis. Unlike normalization, which aims for data integrity through minimizing redundancy, Dimensional Modeling prioritizes efficient querying and reporting for analytical purposes. This involves designing structures that align with how users typically access and analyze data in a data warehouse.
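A minimal star-schema sketch shows what such a structure can look like: one fact table of measurements surrounded by descriptive dimension tables. The table names (fact_sales, dim_product, dim_date), columns, and figures are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, iso_date TEXT, year INTEGER, month INTEGER);
-- Fact table: one row per sale, with keys into each dimension plus a measure.
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    revenue REAL
);
INSERT INTO dim_product VALUES
    (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware'), (3, 'Manual', 'Books');
INSERT INTO dim_date VALUES (1, '2024-01-15', 2024, 1), (2, '2024-02-10', 2024, 2);
INSERT INTO fact_sales VALUES (1, 1, 100.0), (1, 3, 20.0), (2, 2, 50.0), (2, 3, 30.0);
""")

# Typical analytical query: aggregate a measure, grouped by a dimension attribute.
revenue_by_category = conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

# The same fact table sliced by a different dimension.
revenue_by_month = conn.execute("""
    SELECT d.month, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_date d ON d.date_key = f.date_key
    GROUP BY d.month
    ORDER BY d.month
""").fetchall()
```

Each report is a single join from the fact table to one dimension, which keeps analytical queries short and predictable.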
Which factor is typically NOT considered when deciding how to partition data?
- Data compression ratio
- Data distribution across servers
- Query performance requirements
- Security requirements
The data compression ratio is typically not considered when deciding how to partition data. Partitioning decisions are primarily based on factors such as data distribution, query performance, and security requirements, but compression considerations are addressed separately.
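To illustrate the data-distribution factor, here is a hedged sketch of hash partitioning, which is one common scheme among several (range and list partitioning are others). The helper name partition_for, the choice of SHA-256, and the partition count of four are assumptions made for the example.

```python
import hashlib

def partition_for(key, num_partitions):
    """Map a key to a partition number using a stable hash of its string form."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

NUM_PARTITIONS = 4  # hypothetical number of servers or partitions

partitions = {p: [] for p in range(NUM_PARTITIONS)}
for customer_id in range(1000):
    partitions[partition_for(customer_id, NUM_PARTITIONS)].append(customer_id)

# Rows per partition; a good partitioning key spreads rows fairly evenly.
sizes = [len(rows) for rows in partitions.values()]
print(sizes)
```

A stable hash is used instead of Python's built-in hash() so that the same key maps to the same partition across processes and restarts.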
In Forward Engineering, the process starts with a _______ data model and progresses towards a detailed physical model.
- Abstract
- Conceptual
- Concrete
- Logical
In Forward Engineering, the process begins with a Logical Data Model. This model represents the abstract structure of the data without concerning itself with the physical implementation. It serves as a bridge between the high-level conceptual model and the detailed physical model.
Scenario: A hospital manages doctors, patients, and appointments. Each patient can have multiple appointments, each doctor can have multiple appointments, and each appointment is associated with one patient and one doctor. How would you represent this scenario in an ERD?
- Many-to-Many
- Many-to-One
- One-to-Many
- One-to-One
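For this scenario, One-to-Many relationships are appropriate, not One-to-One. One patient can have many appointments and one doctor can have many appointments, while each appointment references exactly one patient and one doctor. The Appointment entity therefore sits on the "many" side of two one-to-many relationships and, in effect, resolves the many-to-many relationship between doctors and patients.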
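The cardinalities stated in the scenario can be sketched as a schema in which every appointment row carries exactly one patient reference and one doctor reference, while a patient or doctor may appear in many appointment rows. The table and column names and the sample rows below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE doctor (doctor_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Each appointment must reference exactly one patient and one doctor.
CREATE TABLE appointment (
    appointment_id INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
    doctor_id INTEGER NOT NULL REFERENCES doctor(doctor_id),
    starts_at TEXT NOT NULL
);
INSERT INTO patient VALUES (1, 'Pat'), (2, 'Quinn');
INSERT INTO doctor VALUES (1, 'Dr. Rao'), (2, 'Dr. Sato');
INSERT INTO appointment (patient_id, doctor_id, starts_at) VALUES
    (1, 1, '2024-03-01 09:00'),
    (1, 2, '2024-03-08 10:00'),
    (2, 1, '2024-03-01 11:00');
""")

# One patient, many appointments.
appointments_for_pat = conn.execute(
    "SELECT COUNT(*) FROM appointment WHERE patient_id = 1"
).fetchone()[0]

# One doctor, many appointments.
appointments_for_rao = conn.execute(
    "SELECT COUNT(*) FROM appointment WHERE doctor_id = 1"
).fetchone()[0]

# An appointment pointing at a doctor that does not exist is rejected.
try:
    conn.execute(
        "INSERT INTO appointment (patient_id, doctor_id, starts_at) "
        "VALUES (1, 99, '2024-03-15 09:00')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The NOT NULL foreign keys are what enforce the "exactly one patient and one doctor" rule for each appointment.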
In NoSQL databases, which consistency model sacrifices consistency in favor of availability and partition tolerance?
- Causal Consistency
- Eventual Consistency
- Sequential Consistency
- Strong Consistency
Eventual Consistency in NoSQL databases sacrifices immediate consistency in favor of high availability and partition tolerance. It allows replicas of data to become consistent over time, ensuring that all replicas will eventually converge to the same value. This trade-off is suitable for systems where availability is crucial, and temporary inconsistencies can be tolerated.
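The convergence behaviour can be sketched with a toy in-memory model. It assumes last-write-wins conflict resolution based on caller-supplied timestamps, which is only one of several strategies real systems use, and the Replica class and its methods are hypothetical.

```python
class Replica:
    """A toy replica that resolves conflicts by keeping the newest timestamp."""

    def __init__(self):
        self.store = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        current = self.store.get(key)
        # Last-write-wins: accept only writes newer than what is stored.
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None

    def merge_from(self, other):
        """Anti-entropy step: pull every entry from another replica."""
        for key, (timestamp, value) in other.store.items():
            self.write(key, value, timestamp)

a, b = Replica(), Replica()

# Both replicas accept writes independently, so they stay available.
a.write("cart", ["book"], timestamp=1)
b.write("cart", ["book", "pen"], timestamp=2)

# Before synchronization, replica a still serves the older value.
stale_read = a.read("cart")

# After exchanging state in both directions, the replicas agree.
a.merge_from(b)
b.merge_from(a)
converged = a.read("cart") == b.read("cart")
final_value = a.read("cart")
```

The stale read before the merge is the temporary inconsistency that this model accepts in exchange for availability.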