A self-driving car company has millions of images labeled with either "pedestrian" or "no pedestrian". They want the car to automatically detect pedestrians. Which type of learning and algorithm would be optimal for this task?

  • Supervised Learning with Convolutional Neural Networks
  • Unsupervised Learning with Apriori Algorithm
  • Reinforcement Learning with Monte Carlo Methods
  • Semi-Supervised Learning with DBSCAN
Supervised Learning with Convolutional Neural Networks (CNNs) is the optimal choice: the task provides millions of labeled examples (supervised learning), and the inputs are images, which CNNs handle well by learning spatial hierarchies of visual features. The other options do not fit: Apriori is used for association rule mining, Monte Carlo reinforcement learning for sequential decision-making from rewards, and DBSCAN is a clustering algorithm, not a semi-supervised method.

Apache Spark offers an optimized engine that supports _______ computations, enabling faster data analytics.

  • Batch
  • Single-threaded
  • Real-time
  • Static
Apache Spark offers an optimized engine that supports real-time computations. Through its streaming APIs (such as Structured Streaming), Spark can process data as it arrives, making it suitable for real-time and near-real-time analytics in addition to batch workloads. This is a key advantage of Spark over batch-only systems such as Hadoop MapReduce.

What is a potential consequence of biased algorithms in AI systems?

  • Improved accuracy
  • Enhanced user trust
  • Unfair or discriminatory outcomes
  • Faster data processing
Biased algorithms can lead to unfair or discriminatory outcomes, as they may favor certain groups over others. This can have significant ethical and legal implications, causing harm to individuals and undermining trust in AI systems.
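One simple way to surface such bias is to compare a model's positive-prediction rate across groups, a check sometimes called the demographic parity gap. A minimal sketch, using made-up predictions and group labels (the data and function name are illustrative, not from any real system):

```python
# Minimal demographic-parity check on hypothetical model output.
from collections import defaultdict

def selection_rates(predictions, groups):
    """Return the positive-prediction rate for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}

# Hypothetical approval predictions for two demographic groups.
preds  = [1, 1, 0, 1, 0, 0, 0, 0, 1, 1]
groups = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]

rates = selection_rates(preds, groups)          # per-group approval rates
parity_gap = abs(rates["A"] - rates["B"])       # large gap -> possible bias
```

A gap like this is only a starting signal; deciding whether it reflects unfair treatment requires domain and legal context.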

In CNNs, the layers that preserve the spatial relationships between pixels by learning image features through small squares of input data are called _______ layers.

  • Pooling
  • Convolution
  • Fully Connected
  • Batch Normalization
In CNNs, the layers that preserve the spatial relationships between pixels by learning image features through small squares of input data are called "Convolution" layers. These layers apply convolutional operations to extract features from the input data, preserving the local spatial relationships in the image.
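The "small squares of input data" are the kernel windows the layer slides across the image. A minimal NumPy sketch of that operation (CNN libraries actually compute this cross-correlation; the tiny image and edge-detecting kernel below are illustrative):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid cross-correlation, as applied inside a CNN convolution layer."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Each output value summarizes one small square of the input.
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# A vertical-edge kernel applied to a tiny image with a dark/bright split.
image = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)
kernel = np.array([[1, -1]], dtype=float)  # responds where brightness changes

feature_map = conv2d(image, kernel)  # nonzero only at the edge column
```

Because the same kernel slides over every position, nearby pixels are always processed together, which is exactly how spatial relationships are preserved.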

Which technology is NOT typically associated with real-time data processing?

  • Apache Kafka
  • Apache Spark
  • Hadoop MapReduce
  • MySQL
Apache Kafka and Apache Spark are widely used for real-time or near-real-time data processing, and Hadoop MapReduce, although batch-oriented, is part of the same big data processing ecosystem. MySQL, by contrast, is a traditional relational database designed for transactional workloads, not for real-time stream processing.

The _______ layer in a neural network is responsible for combining features across the input data, often used in CNNs.

  • Input
  • Hidden
  • Output
  • Convolutional
The blank should be filled with "Convolutional." Convolutional layers are used in Convolutional Neural Networks (CNNs) to combine features across input data by applying convolution operations. This is essential for tasks like image recognition.

In the context of model deployment, _______ is the process of ensuring the model's predictions remain consistent and accurate over time.

  • Monitoring
  • Training
  • ETL
  • Visualization
Model monitoring is the process of continuously tracking the performance and behavior of a deployed machine learning model. It involves checking for deviations, evaluating predictions against real-world data, and ensuring that the model remains accurate and reliable over time. Monitoring is crucial for maintaining model quality in production.
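A minimal sketch of such a check, assuming ground-truth labels eventually arrive for live predictions (all names and the 0.90 baseline below are hypothetical):

```python
def rolling_accuracy(preds, labels, window):
    """Accuracy over the most recent `window` predictions."""
    recent = list(zip(preds, labels))[-window:]
    correct = sum(p == y for p, y in recent)
    return correct / len(recent)

def drift_alert(baseline_acc, live_acc, tolerance=0.05):
    """Flag the model for review if live accuracy falls more than
    `tolerance` below the accuracy measured at validation time."""
    return live_acc < baseline_acc - tolerance

# Hypothetical deployment: the model was validated at 90% accuracy,
# but recent live traffic shows degraded performance.
preds  = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
labels = [1, 0, 1, 0, 1, 0, 0, 0, 1, 0]
live = rolling_accuracy(preds, labels, window=10)
alert = drift_alert(baseline_acc=0.90, live_acc=live)
```

Production systems add more signals (input distribution drift, latency, prediction distributions), but the pattern is the same: compare live behavior against a baseline and alert on deviation.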

In unsupervised learning, _______ is a method where the objective is to group similar items into sets.

  • Principal Component Analysis
  • Regression Analysis
  • Hierarchical Clustering
  • Decision Trees
The correct term is "Hierarchical Clustering." In unsupervised learning, clustering is a method used to group similar items or data points into sets or clusters based on their similarities. Hierarchical clustering is one of the techniques for this purpose. It creates a tree-like structure (dendrogram) to represent the relationships between data points, making it easier to identify groups of similar items.
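The bottom-up (agglomerative) variant can be sketched in a few lines: start with each point as its own cluster and repeatedly merge the closest pair. This toy single-linkage version on 1-D points is illustrative only; libraries such as SciPy provide the real dendrogram-building implementation:

```python
def single_linkage_clusters(points, max_distance):
    """Agglomerative clustering with single linkage: repeatedly merge
    the two closest clusters until no pair is within `max_distance`."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        # Find the closest pair of clusters (min distance between members).
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_distance:
            break  # remaining clusters are too far apart to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Two obvious groups on a number line are recovered automatically.
groups = single_linkage_clusters([1.0, 1.2, 1.1, 8.0, 8.3], max_distance=1.0)
```

The sequence of merges is exactly what a dendrogram records; cutting the tree at a given height yields the flat clusters.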

You're working for a company that generates vast amounts of log data daily. The company wants to analyze this data to gain insights into user behavior and system performance. Which Big Data tool would be most suitable for storing and processing this data efficiently?

  • Apache Hadoop
  • Apache Spark
  • Apache Kafka
  • Apache Cassandra
Apache Hadoop is the most suitable choice here. Its distributed file system (HDFS) stores vast volumes of log data cheaply and reliably, and its processing framework analyzes that data at scale to surface insights into user behavior and system performance. Apache Kafka, by contrast, is a distributed streaming platform geared toward ingesting and transporting log data in real time, not toward long-term storage and batch analysis.

Which activation function maps any input to a value between 0 and 1?

  • ReLU
  • Sigmoid
  • Tanh
  • Softmax
The sigmoid activation function maps any input to a value between 0 and 1. It's commonly used in neural networks for binary classification problems and helps introduce non-linearity in the network's computations.