Which SQL clause is used to filter the records returned from a SELECT query?
- FROM
- GROUP BY
- ORDER BY
- WHERE
The WHERE clause is used to filter records returned from a SELECT query in SQL. It allows you to specify conditions that the retrieved data must meet.
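As a minimal sketch of this filtering, the snippet below runs a SELECT with a WHERE clause against an in-memory SQLite database using Python's built-in sqlite3 module. The `orders` table, its columns, and the sample rows are hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory database with a hypothetical `orders` table (illustrative names only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("North", 120.0), ("South", 80.0), ("North", 45.5)],
)

# WHERE keeps only the rows that satisfy both conditions.
rows = conn.execute(
    "SELECT id, region, amount FROM orders "
    "WHERE region = ? AND amount > ? ORDER BY id",
    ("North", 100),
).fetchall()

print(rows)  # [(1, 'North', 120.0)]
conn.close()
```

Only the first row is returned, because it is the only one that is both in the North region and above 100.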
What is a 'fact table' in a data warehouse and how does it differ from a 'dimension table'?
- Fact table contains descriptive data, whereas dimension tables contain quantitative data.
- Fact table contains quantitative data and is connected to dimension tables, whereas dimension tables provide descriptive information about data in the fact table.
- Fact table is used for historical data, whereas dimension table is used for real-time data.
- Fact table is used for indexing, whereas dimension table is used for primary storage.
A 'fact table' in a data warehouse stores quantitative measures (such as sales amounts or quantities) and links through foreign keys to dimension tables, which hold the descriptive attributes (such as product, customer, or date) that give those measures context. The fact table sits at the center of a star schema and is the main input for analytical queries.
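The relationship can be sketched with a tiny schema in SQLite, driven from Python. The table names `fact_sales` and `dim_product`, their columns, and the data are assumptions made up for this example, not part of any particular warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension table: descriptive attributes (hypothetical).
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    name TEXT,
    category TEXT
);
-- Fact table: numeric measures plus a key pointing at the dimension.
CREATE TABLE fact_sales (
    sale_id INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product(product_id),
    quantity INTEGER,
    revenue REAL
);
INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Books');
INSERT INTO fact_sales (product_id, quantity, revenue) VALUES
    (1, 3, 30.0), (1, 1, 10.0), (2, 2, 24.0);
""")

# Measures come from the fact table; the grouping label comes from the dimension.
totals = conn.execute("""
    SELECT d.category, SUM(f.quantity), SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()

print(totals)  # [('Books', 2, 24.0), ('Hardware', 4, 40.0)]
conn.close()
```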
What is the role of change data capture in ETL processes?
- Aggregating data for reporting purposes
- Capturing and tracking changes in source data over time
- Encrypting data during transfer
- Indexing data for faster retrieval
Change Data Capture (CDC) in ETL processes involves identifying and tracking changes in source data over time. This allows for the extraction of only the modified data, reducing processing time and ensuring data accuracy in the target system.
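One simple way to picture this is a snapshot comparison, sketched below. This is only an illustrative variant: production CDC tools often read database transaction logs or rely on timestamps instead, and the function name `capture_changes` and the sample records are hypothetical.

```python
def capture_changes(previous, current):
    """Compare two snapshots keyed by primary key and return what changed."""
    inserts = {k: v for k, v in current.items() if k not in previous}
    updates = {
        k: v for k, v in current.items()
        if k in previous and previous[k] != v
    }
    deletes = [k for k in previous if k not in current]
    return {"insert": inserts, "update": updates, "delete": deletes}


# Hypothetical source snapshots taken at two points in time.
previous = {
    1: {"name": "Ada", "city": "London"},
    2: {"name": "Lin", "city": "Oslo"},
    3: {"name": "Sam", "city": "Rome"},
}
current = {
    1: {"name": "Ada", "city": "London"},   # unchanged, not extracted
    2: {"name": "Lin", "city": "Bergen"},   # updated
    4: {"name": "Kai", "city": "Lima"},     # inserted
}

changes = capture_changes(previous, current)
print(changes)
```

Only records 2, 3, and 4 appear in the result, so the unchanged record 1 never has to be reprocessed.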
In BI tools, what is the purpose of a dashboard?
- Data Cleaning
- Data Encryption
- Data Storage
- Presenting Key Metrics
The purpose of a dashboard in BI tools is to present key metrics and insights in a visually accessible format. Dashboards provide a consolidated view of important information, making it easier for users to monitor performance and draw conclusions from the data.
In a real-time stock trading application, what algorithm would you use to ensure that you always get the best or optimal solution for stock price analysis?
- Bellman-Ford Algorithm
- Dijkstra's Algorithm
- Dynamic Programming
- Greedy Algorithm
Dynamic Programming is the option that guarantees an optimal solution, provided the problem has optimal substructure and overlapping subproblems: it solves every relevant subproblem and combines the results. A Greedy Algorithm makes the locally optimal choice at each stage, which is fast and attractive in time-sensitive environments, but it reaches the global optimum only for problems that have the greedy-choice property. Dijkstra's Algorithm and the Bellman-Ford Algorithm solve shortest-path problems on graphs and do not directly address stock price analysis.
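As a hedged illustration of the gap between a locally optimal choice and a guaranteed optimum, the sketch below uses a toy coin-change problem rather than real stock data. The problem, the coin values, and the function names are assumptions chosen only because they make the difference easy to see.

```python
def greedy_coins(coins, amount):
    """Always take the largest coin that fits; fast, but not always optimal."""
    count = 0
    for coin in sorted(coins, reverse=True):
        count += amount // coin
        amount %= coin
    return count if amount == 0 else None


def dp_coins(coins, amount):
    """Dynamic programming: best[v] is the fewest coins that sum to v."""
    inf = float("inf")
    best = [0] + [inf] * amount
    for value in range(1, amount + 1):
        for coin in coins:
            if coin <= value and best[value - coin] + 1 < best[value]:
                best[value] = best[value - coin] + 1
    return best[amount] if best[amount] != inf else None


coins = [1, 3, 4]
print(greedy_coins(coins, 6))  # 3  (4 + 1 + 1)
print(dp_coins(coins, 6))      # 2  (3 + 3)
```

The greedy version does less work per decision, while the dynamic programming version pays for a full table of subproblems in exchange for a guaranteed minimum.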
In ETL, what is the significance of data staging?
- Direct loading of data into the target system
- Final storage of cleaned data
- Skipped phase in ETL process
- Temporary storage of raw data before transformation
Data staging in ETL is the temporary storage of raw data before it undergoes transformation. It allows for data validation, debugging, and auditing before the cleaned data is loaded into the target system.
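A minimal sketch of the idea, using plain Python lists in place of real staging and target tables. The field names and sample rows are hypothetical.

```python
# Raw extract, exactly as it arrived from a hypothetical source system.
raw_rows = [
    {"id": "1", "amount": "19.99"},
    {"id": "2", "amount": "n/a"},   # bad value that should not reach the target
    {"id": "3", "amount": "5"},
]

# Staging area: raw data is landed unchanged so it can be audited or replayed.
staging = list(raw_rows)

target, rejected = [], []
for row in staging:
    try:
        # Transformation step: cast text fields to proper types.
        target.append({"id": int(row["id"]), "amount": float(row["amount"])})
    except ValueError:
        # Invalid rows stay behind for debugging instead of being loaded.
        rejected.append(row)

print(target)    # [{'id': 1, 'amount': 19.99}, {'id': 3, 'amount': 5.0}]
print(rejected)  # [{'id': '2', 'amount': 'n/a'}]
```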
In analyzing sales data for multiple regions, what visualization technique would best allow for the comparison of trends and patterns across different regions?
- Bar Charts
- Geographic Maps
- Line Charts
- Pie Charts
Geographic Maps tie each value to its location, so differences and spatial patterns between regions are visible at a glance. Line Charts are better suited to showing change over time and Bar Charts to ranking a small number of categories, while Pie Charts are generally not recommended for regional comparisons.
In R, which function is used to read a CSV file?
- import.csv
- load.csv
- read.csv
- read_file
The read.csv function in R is used to read a CSV (Comma-Separated Values) file. It is a convenient function that reads the data from a CSV file and creates a data frame, making it easy to work with tabular data in R.
When executing data = {'a': 1, 'b': 2}; print(data.get(____, 'Not Found')), with a missing key, the output is "Not Found".
- 'Not Found'
- 'a'
- 'b'
- 'c'
The get method returns the value for the specified key or a default value if the key is not found. In this case, 'c' is not present, so it returns 'Not Found'.
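The behaviour can be checked directly. The snippet below reuses the dictionary from the question; the variable names are only for illustration.

```python
data = {'a': 1, 'b': 2}

found = data.get('a', 'Not Found')    # key exists, so its value is returned
missing = data.get('c', 'Not Found')  # key is absent, so the default is returned
no_default = data.get('c')            # without a default, get returns None

print(found, missing, no_default)  # 1 Not Found None
```

Unlike `data['c']`, which raises a KeyError, `get` never raises for a missing key.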
Explain how 'git stash' is useful in managing changes.
- Apply changes from one branch to another.
- Create a backup of the entire repository.
- Permanently discard changes in the working directory.
- Temporarily save changes that are not ready to be committed, allowing for a clean working directory.
'git stash' temporarily saves changes that are not yet ready to be committed and restores a clean working directory. This is useful when switching between branches or addressing urgent issues; the saved changes can be reapplied later with 'git stash pop' or 'git stash apply'.