In Transformer architectures, the _______ mechanism allows the model to focus on different parts of the input data differently.
- Self-Attention
- Batch Normalization
- Recurrent Layer
- Convolutional Layer
In Transformer architectures, the mechanism that allows the model to focus on different parts of the input data differently is known as "Self-Attention." Each input element is projected into a query, a key, and a value; the model scores every query against every key and uses the resulting weights to mix the values, so each position can weigh all other positions by their relevance to the current context, as sketched below.
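A minimal NumPy sketch of scaled dot-product self-attention (the variant used in the original Transformer) illustrates the idea. The function name `self_attention`, the toy dimensions, and the random weight matrices are illustrative assumptions, not part of the quiz:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over a sequence X of shape (seq_len, d_model)."""
    Q = X @ W_q  # queries
    K = X @ W_k  # keys
    V = X @ W_v  # values
    d_k = Q.shape[-1]
    # Score every query against every key, scaled to keep gradients stable.
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    # Each output position is a relevance-weighted mix of all value vectors.
    return weights @ V, weights

# Toy example (assumed sizes): 4 tokens, model dimension 8, head dimension 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
out, weights = self_attention(X, W_q, W_k, W_v)
print(weights.round(2))  # row i shows how strongly token i attends to each token
```

Printing the weight matrix makes the "focus on different parts differently" point concrete: each row is a distribution over the input positions, and it varies from token to token.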