An AI startup with limited computational resources is building an image classifier. They don't have the capability to train a deep neural network from scratch. What approach can they use to leverage the capabilities of deep learning without the extensive training time?

  • Transfer learning
  • Reinforcement learning
  • Genetic algorithms
  • Random forest classifier
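Transfer learning allows the startup to use a pre-trained deep neural network (e.g., a pre-trained CNN) as a starting point. Typically the pre-trained layers are frozen or lightly fine-tuned, and only a small new classification layer is trained on the startup's own images. This significantly reduces training time and compute requirements while still benefiting from the capabilities of deep learning.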

A common architecture for real-time data processing involves using ________ to ingest and process streaming data.

  • Hadoop
  • Spark
  • Batch Processing
  • Data Lakes
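In real-time data processing, Apache Spark is commonly used to ingest and process streaming data. Its streaming APIs (Spark Streaming and Structured Streaming) process incoming data in small micro-batches, which gives near-real-time results. Hadoop MapReduce, by contrast, is designed for batch workloads, while batch processing and data lakes are a processing style and a storage pattern rather than streaming engines.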

In a skewed distribution, which measure of central tendency is most resistant to the effects of outliers?

  • Mean
  • Median
  • Mode
  • Geometric Mean
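The median is the most resistant measure of central tendency in a skewed distribution. Because it is the middle value of the ordered data, it depends on the rank of the observations rather than their magnitude, so extreme values barely move it. The mean and the geometric mean are both pulled toward outliers, which makes them less representative of the data's central location. The mode is not shifted by outliers either, but in a skewed distribution it marks the peak rather than the center, and it can be unstable for continuous data.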
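As a small illustration of the explanation above, the following sketch uses Python's standard `statistics` module to compare the three measures with and without one extreme value. The sample values are invented for illustration.

```python
import statistics

# Hypothetical sample, invented for illustration.
base = [2, 3, 3, 4, 5]
with_outlier = base + [100]  # one extreme value added


def summarize(values):
    """Return (mean, median, mode) for a list of numbers."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.mode(values),
    )


base_summary = summarize(base)
outlier_summary = summarize(with_outlier)

print("without outlier:", base_summary)
print("with outlier:   ", outlier_summary)
```

In this sample the single outlier raises the mean from 3.4 to 19.5, while the median only moves from 3 to 3.5.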

What is a common technique to prevent overfitting in linear regression models?

  • Increasing the model complexity
  • Reducing the number of features
  • Regularization
  • Using a smaller training dataset
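Regularization is a common technique used to prevent overfitting in linear regression models. It adds a penalty on the size of the coefficients to the cost function, which discourages overly complex models. L1 (Lasso) regularization penalizes the absolute values of the coefficients and can drive some of them to exactly zero; L2 (Ridge) regularization penalizes their squares and shrinks them toward zero.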
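The effect of an L2 penalty can be seen in the simplest possible case. The sketch below assumes a single feature and no intercept, where the ridge solution has a closed form; the function name `ridge_slope`, the penalty strengths, and the data are all illustrative assumptions, not part of any library.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form slope for one-feature regression through the origin
    with an L2 penalty: minimizes sum((y - w*x)^2) + lam * w^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Hypothetical data, roughly y = 2x with a little noise.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# A larger penalty shrinks the fitted slope toward zero.
for lam in (0.0, 1.0, 10.0):
    print(f"lambda={lam:>4}: slope={ridge_slope(xs, ys, lam):.3f}")
```

With `lam=0.0` this reduces to ordinary least squares; increasing `lam` trades some fit on the training data for a smaller coefficient.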

In which type of data do you often encounter a mix of structured tables and unstructured text?

  • Structured Data
  • Semi-Structured Data
  • Unstructured Data
  • Multivariate Data
Semi-structured data often contains a mix of structured elements and unstructured text. It does not follow a rigid tabular schema, but it carries tags or keys that organize its content, so it can combine well-defined fields with free-form content. Typical examples include JSON and XML documents, web data, and records in NoSQL databases.

In transfer learning, a model trained on a large dataset is used as a starting point, and the knowledge gained is transferred to a new, _______ task.

  • Completely unrelated
  • Identical
  • Similar
  • Smaller-scale
In transfer learning, a model trained on a large dataset is used as a starting point, and the knowledge gained is transferred to a new, similar task. This leverages the pre-trained model's knowledge to improve performance on the new task, particularly when the tasks are related.

In Data Science, when dealing with large datasets that do not fit into memory, the Python library _______ can be a useful tool for efficient computations.

  • NumPy
  • Pandas
  • Dask
  • SciPy
When working with large datasets that do not fit into memory, the Python library "Dask" is a useful tool for efficient computations. Dask provides parallel and distributed computing capabilities, enabling data scientists to handle larger-than-memory datasets using familiar Python tools.
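The core idea behind larger-than-memory computation is to split the data into chunks and combine partial results. The sketch below shows that idea using only the standard library; it is not Dask's API, and the helper names `chunked` and `out_of_core_mean` are hypothetical. Dask applies the same principle to NumPy-style arrays and pandas-style dataframes and adds scheduling and parallel execution on top.

```python
def chunked(iterable, size):
    """Yield successive lists of at most `size` items."""
    chunk = []
    for item in iterable:
        chunk.append(item)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def out_of_core_mean(values, chunk_size=1000):
    """Mean of a stream, holding only one chunk in memory at a time."""
    total, count = 0.0, 0
    for chunk in chunked(values, chunk_size):
        total += sum(chunk)   # partial result for this chunk
        count += len(chunk)
    if count == 0:
        raise ValueError("mean of empty stream")
    return total / count


# A generator stands in for data too large to load at once.
stream = (i for i in range(1_000_000))
result = out_of_core_mean(stream, chunk_size=10_000)
print(result)
```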

Which layer type in a neural network is primarily responsible for feature extraction and spatial hierarchy?

  • Input Layer
  • Convolutional Layer
  • Fully Connected Layer
  • Recurrent Layer
Convolutional Layers in neural networks are responsible for feature extraction and learning spatial hierarchies, making them crucial in tasks such as image recognition. They apply filters to the input data, capturing different features.
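To make "applying a filter" concrete, here is a minimal pure-Python sketch of a single filter sliding over a tiny image. The function name `conv2d_valid`, the image, and the filter values are illustrative assumptions; in a real convolutional layer the filter values are learned during training rather than hand-picked.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return
    the map of weighted sums. This is cross-correlation, which is what
    deep learning libraries typically call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 4x4 image: dark left half (0), bright right half (1).
image = [[0, 0, 1, 1] for _ in range(4)]

# Hand-picked filter that responds to a left-to-right brightness increase.
vertical_edge = [[-1, 1]]

feature_map = conv2d_valid(image, vertical_edge)
for row in feature_map:
    print(row)
```

Each row of the resulting feature map is `[0, 1, 0]`: the filter responds only at the column where the brightness changes, which is the kind of local feature a convolutional layer extracts.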

Which activation function can alleviate the vanishing gradient problem to some extent?

  • Sigmoid
  • ReLU (Rectified Linear Unit)
  • Tanh (Hyperbolic Tangent)
  • Leaky ReLU
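The ReLU activation function mitigates the vanishing gradient problem, a common issue when training deep networks. Sigmoid and tanh saturate for large inputs, so their derivatives shrink toward zero and gradients vanish as they are multiplied across layers. ReLU has a derivative of exactly 1 for positive inputs, so gradients pass through active units undiminished during backpropagation, making deep networks easier to train. Leaky ReLU is a variant that also keeps a small gradient for negative inputs.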
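A rough way to see the difference is to multiply the activation derivatives along a chain of layers, as backpropagation does. The sketch below ignores the weights to isolate the activation's contribution; the helper names, the 10-layer depth, and the pre-activation value of 0.5 are assumptions chosen for illustration.

```python
import math


def sigmoid_grad(x):
    """Derivative of the sigmoid; never exceeds 0.25."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def leaky_relu_grad(x, slope=0.01):
    """Derivative of Leaky ReLU: small nonzero slope for negative inputs."""
    return 1.0 if x > 0 else slope


def chained_grad(grad_fn, pre_activations):
    """Product of local derivatives along a chain of layers
    (weights are ignored to isolate the activation's effect)."""
    product = 1.0
    for z in pre_activations:
        product *= grad_fn(z)
    return product


# Hypothetical pre-activation values for a 10-layer chain.
zs = [0.5] * 10

sigmoid_chain = chained_grad(sigmoid_grad, zs)
relu_chain = chained_grad(relu_grad, zs)
print(sigmoid_chain, relu_chain)
```

Because each sigmoid derivative is at most 0.25, the product over ten layers is tiny, while the ReLU product stays at 1 as long as every unit is active.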

In Tableau, you can connect to various data sources and create a unified view known as a _______.

  • Dashboard
  • Workbook
  • Storyboard
  • Data source
In Tableau, a "Workbook" is where you can connect to various data sources, design visualizations, and create a unified view of your data. It serves as a container for creating and organizing your data visualizations and analyses.