In data preprocessing, what does 'normalization' refer to?
- Data imputation
- Handling categorical data
- Removing outliers
- Scaling numerical features to a standard range
Normalization in data preprocessing refers to scaling numerical features to a standard range, often 0 to 1 (min-max scaling). Putting features on a comparable scale keeps a feature with large raw values, such as income, from dominating a feature with small raw values, such as age, in distance-based or gradient-based methods.
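As a minimal sketch of the scaling described above, min-max normalization maps each value x to (x - min) / (max - min). The function name and the sample age and income values are hypothetical, chosen only for illustration.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature has no spread; map every value to 0.0
        # instead of dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical features on very different raw scales.
ages = [20, 30, 40]
incomes = [30_000, 60_000, 90_000]

scaled_ages = min_max_scale(ages)
scaled_incomes = min_max_scale(incomes)
```

After scaling, both features span the same 0 to 1 range, so neither outweighs the other purely because of its units.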
What is the primary difference between SOAP and REST APIs in terms of their communication protocols?
- REST requires a pre-defined contract, while SOAP does not.
- SOAP is only used in web applications, while REST is used in mobile applications.
- SOAP is stateless, while REST is stateful.
- SOAP uses XML for message formatting, while REST typically uses JSON.
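The primary difference is in message formatting: SOAP is a protocol that wraps every message in an XML envelope, while REST is an architectural style that typically exchanges JSON, although it can carry other formats. REST is also stateless, meaning each client request contains all the information the server needs to process it, while SOAP services can be stateful or stateless.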
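To make the formatting difference concrete, here is a small sketch that builds the same lookup as a SOAP XML envelope and as a REST-style request with a JSON response body. The operation name GetUser, the path /users/42, and the field names are hypothetical; only the SOAP 1.1 envelope namespace is standard.

```python
import json
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap_request(user_id):
    """Wrap a hypothetical GetUser operation in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, "GetUser")  # hypothetical operation
    ET.SubElement(call, "UserId").text = str(user_id)
    return ET.tostring(envelope, encoding="unicode")


def build_rest_request(user_id):
    """Describe the equivalent REST call: the resource is named by the URL."""
    return {"method": "GET", "path": f"/users/{user_id}"}  # hypothetical path


def build_rest_response_body(user_id, name):
    """Serialize a hypothetical user resource as JSON."""
    return json.dumps({"id": user_id, "name": name})


soap_xml = build_soap_request(42)
rest_request = build_rest_request(42)
rest_json = build_rest_response_body(42, "Ada")
```

The SOAP message carries the operation inside the XML body, whereas the REST call expresses it through the HTTP method and path and keeps the payload as plain JSON.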
How does responsive design impact the development of a dashboard for multiple devices?
- It ensures the dashboard layout adapts to different screen sizes, maintaining usability.
- It focuses on enhancing visual appeal at the expense of functionality.
- It increases the development time without providing any significant benefits.
- It restricts the dashboard to a specific device, limiting accessibility.
Responsive design adapts a dashboard's layout to the screen size of the device displaying it, so the dashboard stays usable on phones, tablets, and desktops alike. This improves accessibility and user experience without requiring a separate version for each device.
_________ in data governance refers to the policies and processes ensuring data integrity and security.
- Data Management
- Data Privacy
- Data Quality
- Data Stewardship
Data Stewardship in data governance refers to the policies and processes ensuring data integrity and security. It involves the responsible management and oversight of data to maintain its quality and protect its confidentiality and integrity.
What is the purpose of the VLOOKUP function in Excel?
- Calculating the average of a range of cells.
- Counting the number of non-empty cells in a range.
- Retrieving data from a table by matching a lookup value in its first column and returning the value from a specified column of that row.
- Sorting data in ascending order.
The VLOOKUP function in Excel retrieves data from a table: it searches the first column of a table range for a lookup value and returns the value from a specified column index in the matching row. It is particularly useful for looking up values in large datasets and extracting relevant information.
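The lookup behavior can be sketched in Python as follows. This imitates only the exact-match mode, as in =VLOOKUP("B200", A2:C4, 3, FALSE); the function and the product table are hypothetical illustrations, not Excel's implementation.

```python
def vlookup(lookup_value, table, col_index):
    """Exact-match lookup over a list of rows.

    Scans the first column for lookup_value and returns the cell at the
    1-based col_index of the first matching row.
    """
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]
    return None  # Excel would show #N/A when nothing matches


# Hypothetical product table: code, name, price.
products = [
    ["A100", "Keyboard", 25.0],
    ["B200", "Monitor", 180.0],
    ["C300", "Mouse", 12.5],
]

price = vlookup("B200", products, 3)
missing = vlookup("Z999", products, 2)
```

The column index counts from the first column of the table range, which is why the lookup value must sit in that first column.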
The _________ feature in Power BI allows for the creation of complex data models and relationships.
- DAX
- Data Modeling
- ETL
- Power Query
The Data Modeling feature in Power BI allows users to create complex data models and establish relationships between different tables. This is essential for analyzing and visualizing data effectively. Power Query is used for data transformation, DAX (Data Analysis Expressions) is a formula language, and ETL (Extract, Transform, Load) is a broader process that includes data integration and transformation.
In project management, what tool is commonly used to visually track the progress of project tasks?
- Gantt Chart
- Histogram
- Pie Chart
- Scatter Plot
A Gantt Chart is commonly used in project management to visually track the progress of project tasks over time. It provides a timeline view of tasks, their dependencies, and their current status. Scatter plots, pie charts, and histograms are useful for other types of data visualization but are not typically used for tracking project tasks.
What is the primary purpose of a scatter plot in data visualization?
- Comparing multiple categories in a dataset
- Displaying the distribution of a single variable
- Representing data in chronological order
- Showing the relationship between two variables
A scatter plot is used to visualize the relationship between two variables. Each point on the plot represents a pair of values, allowing for the identification of patterns or correlations between the variables.
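A pattern that shows up visually in a scatter plot can also be summarized numerically. The sketch below pairs two variables into plot points and computes the Pearson correlation coefficient by hand; the study-hours and exam-score values are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: hours studied versus exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 65, 70, 78]

# Each (x, y) pair is one point on the scatter plot.
points = list(zip(hours, scores))
r = pearson_r(hours, scores)
```

A value of r near 1 corresponds to points rising along a straight line, a value near -1 to points falling along one, and a value near 0 to no linear trend.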
How does a data catalog contribute to effective data governance?
- It focuses on data encryption to ensure security.
- It is used for primary data storage.
- It primarily deals with data visualization techniques.
- It provides a centralized repository for storing and managing metadata.
A data catalog contributes to effective data governance by serving as a centralized repository for storing and managing metadata. Metadata includes information about the data, such as its origin, structure, and usage, which is crucial for ensuring data quality and compliance with governance policies.
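As a rough sketch of the idea, a catalog can be modeled as a registry of metadata records that describe datasets without holding the data itself. The class, field names, and dataset entries below are hypothetical and do not reflect any particular catalog product.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Metadata about one dataset: where it came from, its structure, its users."""
    name: str
    origin: str
    schema: dict  # column name -> type description
    used_by: list = field(default_factory=list)


catalog = {}


def register(entry):
    """Add or replace a dataset's metadata record."""
    catalog[entry.name] = entry


def find_by_column(column):
    """Return the names of all datasets whose schema contains the column."""
    return sorted(name for name, e in catalog.items() if column in e.schema)


# Hypothetical entries.
register(CatalogEntry("orders", "sales_db", {"order_id": "int", "customer_id": "int"},
                      used_by=["revenue_report"]))
register(CatalogEntry("customers", "crm_export", {"customer_id": "int", "email": "str"}))

datasets_with_customer_id = find_by_column("customer_id")
```

A query like this is what lets a governance team find every dataset that holds a sensitive column before applying a policy to it.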
What is Hadoop primarily used for in Big Data technologies?
- Data Storage and Processing
- Data Visualization
- Machine Learning
- Real-time Analytics
Hadoop is primarily used for distributed storage and processing of large volumes of data. It spreads data across the machines of a cluster using the Hadoop Distributed File System (HDFS) and processes it in parallel on those machines, making it suitable for tasks like batch processing and analytics.
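Hadoop's classic batch processing model is MapReduce. The sketch below imitates its map, shuffle, and reduce phases in a single Python process to count words; it does not use any Hadoop API, and the two text chunks stand in for blocks of a file that would be stored on different nodes.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group all emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


# Hypothetical input, split into chunks as a distributed file would be.
chunks = ["big data big clusters", "data moves to clusters"]

mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(mapped))
```

In a real cluster each chunk would be mapped on the node that stores it, and only the grouped intermediate pairs would travel over the network to the reducers.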