In L2 regularization, the penalty is proportional to the _______ of the magnitude of the coefficients.
- Square
- Absolute
- Exponential
- Logarithmic
In L2 regularization (Ridge), the penalty is proportional to the square of the magnitude of the coefficients. This regularization technique adds a penalty term to the loss function based on the sum of squared coefficients, which helps prevent overfitting by discouraging large coefficients.
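The penalty term can be sketched in a few lines. The coefficient values and the regularization strength `lam` below are illustrative, not taken from any particular model:

```python
# Sketch of an L2 (Ridge) penalty added to a squared-error loss.

def ridge_loss(y_true, y_pred, coefs, lam):
    """Mean squared error plus lambda times the sum of squared coefficients."""
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    l2_penalty = lam * sum(w ** 2 for w in coefs)  # square of coefficient magnitudes
    return mse + l2_penalty

# With perfect predictions, the loss is just the penalty:
# 0.1 * (3**2 + (-4)**2) = 0.1 * 25 = 2.5
loss = ridge_loss([1.0, 2.0], [1.0, 2.0], coefs=[3.0, -4.0], lam=0.1)
```

Because the penalty grows with the square of each coefficient, large coefficients are punished disproportionately, which is what discourages them.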
Data formats that allow for tagging or marking up of data, giving it some level of organization, often fall into the _______ data category.
- Raw data
- Structured data
- NoSQL data
- XML data
Data formats like XML allow for tagging or marking up data, which gives it a structured organization. XML is an example of semi-structured data, where data elements are labeled, providing some level of organization.
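A minimal example of what "tagging" buys you, using Python's standard-library XML parser on a made-up record: the tags label each field, giving the data organization without a rigid relational schema.

```python
# Semi-structured data: XML tags make fields addressable by label.
import xml.etree.ElementTree as ET

record = "<person><name>Ada</name><age>36</age></person>"
root = ET.fromstring(record)

name = root.find("name").text        # look up a field by its tag
age = int(root.find("age").text)
```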
Random Forest is an ensemble technique that builds multiple:
- Decision trees
- Linear regression models
- Neural networks
- Support vector machines
Random Forest is an ensemble method that builds multiple decision trees, each trained on a random subset of the data and features. Their individual predictions are combined, by majority vote for classification or averaging for regression, to produce a more accurate and robust model. Random Forest is known for its ability to reduce overfitting and provide reliable predictions, making it a popular choice in machine learning.
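The ensemble idea can be sketched with a toy majority vote. The three "trees" below are stand-in threshold functions, not real decision trees trained on data:

```python
# Toy sketch of Random Forest's aggregation step: each tree votes,
# and the majority vote becomes the forest's prediction.
from collections import Counter

def tree_a(x): return 1 if x > 0.5 else 0   # stand-in "trees" with
def tree_b(x): return 1 if x > 0.4 else 0   # slightly different
def tree_c(x): return 1 if x > 0.7 else 0   # decision boundaries

def forest_predict(trees, x):
    votes = Counter(t(x) for t in trees)
    return votes.most_common(1)[0][0]       # majority class wins

pred = forest_predict([tree_a, tree_b, tree_c], 0.6)  # votes: 1, 1, 0 -> 1
```

Because the trees disagree on borderline inputs, the vote smooths out individual trees' errors, which is the intuition behind the reduced overfitting.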
For large-scale image recognition tasks, which type of network, having multiple layers between input and output, is typically used?
- Convolutional Neural Network (CNN)
- Recurrent Neural Network (RNN)
- Long Short-Term Memory (LSTM) Network
- Gated Recurrent Unit (GRU) Network
Convolutional Neural Networks (CNNs) are commonly used in large-scale image recognition tasks. CNNs have multiple layers that can automatically learn hierarchical features from images, making them well-suited for this purpose.
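The core operation a CNN layer performs is a convolution: sliding a small filter across the image and summing elementwise products. A minimal "valid" (no padding) sketch on a tiny made-up image:

```python
# Minimal 2D convolution, the building block of a CNN layer.

def conv2d(image, kernel):
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            s = sum(image[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out

# A vertical-edge filter responds only where the image changes left-to-right.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
kernel = [[1, -1],
          [1, -1]]
result = conv2d(image, kernel)  # nonzero only at the 0 -> 1 boundary
```

Stacking many such layers is what lets a CNN learn hierarchical features, from edges up to object parts.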
In image processing, what technique is commonly used to detect edges in an image?
- Histogram Equalization
- Fourier Transform
- Canny Edge Detection
- K-Means Clustering
Canny Edge Detection is a widely used technique for edge detection in images. It smooths the image with a Gaussian filter, computes intensity gradients, thins candidate edges with non-maximum suppression, and applies hysteresis thresholding, providing information about both the location and strength of edges in the image.
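Canny starts from intensity gradients; this sketch shows only that first step, using simple central differences on a tiny made-up grayscale grid (a full Canny also smooths, thins, and thresholds):

```python
# Gradient magnitude at one pixel: the first step of Canny edge detection.

def gradient_magnitude(img, i, j):
    gx = img[i][j + 1] - img[i][j - 1]   # horizontal intensity change
    gy = img[i + 1][j] - img[i - 1][j]   # vertical intensity change
    return (gx ** 2 + gy ** 2) ** 0.5

img = [[0, 0, 0],
       [0, 0, 255],
       [0, 0, 0]]
g = gradient_magnitude(img, 1, 1)  # large: a bright pixel sits to the right
```

Pixels with large gradient magnitude are the edge candidates that Canny's later stages thin and threshold.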
What is the primary purpose of transfer learning in the context of deep learning for computer vision?
- Training a model from scratch
- Fine-tuning a pre-trained model
- Reducing the number of layers in a neural network
- Converting images into text
Transfer learning in computer vision involves fine-tuning a pre-trained model to adapt it for a new task. It leverages knowledge from a source task to improve performance on a target task, making it more efficient and effective than training from scratch.
When evaluating models for a multi-class classification problem, which method computes the average metric score for each class, considering the other classes as the negative class?
- Micro-averaging
- Macro-averaging
- Weighted averaging
- Mini-batch averaging
Macro-averaging computes the metric score for each class separately, treating all other classes as the "negative" class, and then takes the unweighted mean of the per-class scores. Because it gives equal weight to each class regardless of class size, it is useful when you want every class to count equally in the overall assessment, which makes it particularly informative in imbalanced multi-class classification problems.
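The per-class-then-average idea can be shown with macro-averaged precision on made-up labels for a three-class problem:

```python
# Macro-averaged precision: one-vs-rest precision per class, then the
# unweighted mean, so each class counts equally regardless of size.

def macro_precision(y_true, y_pred, classes):
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if p == c and t == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if p == c and t != c)
        scores.append(tp / (tp + fp) if (tp + fp) else 0.0)
    return sum(scores) / len(scores)

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
# per-class precision: class 0 -> 2/2, class 1 -> 1/1, class 2 -> 2/3
mp = macro_precision(y_true, y_pred, classes=[0, 1, 2])  # (1 + 1 + 2/3) / 3
```

Micro-averaging, by contrast, would pool all the true/false positives first, letting the largest classes dominate the score.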
Which technique considers the spread of data points around the median to identify outliers?
- Box Plot
- Z-Score (Standardization)
- One-Hot Encoding
- K-Means Clustering
The box plot, also known as a box-and-whisker plot, summarizes the spread of data points around the median and identifies outliers using the interquartile range (IQR): points that fall outside the whiskers, typically beyond 1.5 × IQR from the quartiles, are flagged as outliers. Z-Score is used for standardization, One-Hot Encoding handles categorical variables, and K-Means Clustering is a clustering technique, not an outlier-detection method.
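The box-plot rule is easy to compute directly. A sketch using the standard library's quartile function on made-up data (note that different quartile conventions can shift the fences slightly):

```python
# Box-plot outlier rule: flag points beyond Q1 - 1.5*IQR or Q3 + 1.5*IQR.
import statistics

def iqr_outliers(data):
    xs = sorted(data)
    q1, _, q3 = statistics.quantiles(xs, n=4)   # quartiles around the median
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr     # whisker "fences"
    return [x for x in xs if x < lo or x > hi]

outliers = iqr_outliers([1, 2, 3, 4, 5, 6, 7, 100])  # only 100 is flagged
```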
In Big Data processing, _______ operations filter and sort data, while _______ operations perform aggregations and transformations.
- Map, Reduce
- Filter, Join
- Shuffle, Merge
- Merge, Filter
In Big Data processing with the MapReduce model, Map operations filter and sort the data, while Reduce operations perform aggregations and transformations. So the first blank should be filled with "Map" and the second with "Reduce."
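A toy word count in the MapReduce style makes the division of labor concrete: the map phase emits and sorts key-value pairs, and the reduce phase aggregates the values for each key.

```python
# Word count in the MapReduce style: map emits (word, 1) pairs and sorts
# them by key; reduce aggregates the counts for each word.
from itertools import groupby
from operator import itemgetter

def map_phase(lines):
    pairs = []
    for line in lines:
        for word in line.split():
            pairs.append((word, 1))              # emit (key, value)
    return sorted(pairs, key=itemgetter(0))      # shuffle/sort by key

def reduce_phase(pairs):
    return {key: sum(v for _, v in group)        # aggregate per key
            for key, group in groupby(pairs, key=itemgetter(0))}

counts = reduce_phase(map_phase(["big data big", "data tools"]))
```

In a real cluster the map tasks run in parallel over data partitions and the framework handles the shuffle, but the map/reduce split is the same.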
Which type of data requires more advanced tools and techniques for storage, retrieval, and processing due to its complexity and lack of structure?
- Structured Data
- Unstructured Data
- Semi-Structured Data
- Big Data
Unstructured data is typically more complex, lacking a fixed structure, and can include text, images, audio, and video. To handle such data, advanced tools and techniques like natural language processing, deep learning, and NoSQL databases are often required. Unstructured data poses challenges due to its variability and unpredictability.