Deep Learning Interview Prep Course | Full Course [100 Q&A's]
About the Author: LunarTech
LunarTech is an online tech academy focused on data science, machine learning, and quantitative analysis — covering both the theoretical foundations (linear algebra, calculus, statistics) and the practical Python / SQL toolchain that working data scientists use. The school operates globally with cohort-based and self-paced tracks.
The CourseFlix listing carries twelve LunarTech courses spanning machine-learning theory, deep learning, applied data-science workflows, and the math fundamentals underlying the field. Material is paid and aimed at engineers and analysts transitioning into formal data-science roles or upskilling within them.
Watch Online 100 lessons
| # | Lesson Title | Duration | Access |
|---|---|---|---|
| 1 | Q1 - What is Deep Learning? | 04:07 | Demo |
| 2 | Q2 - What is Deep Learning? | 04:14 | |
| 3 | Q3 - What is a Neural Network? | 07:14 | |
| 4 | Q4 - Explain the concept of a neuron in Deep Learning. | 03:34 | |
| 5 | Q5 - Explain the architecture of Neural Networks in a simple way | 07:53 | |
| 6 | Q6 - What is an activation function in a Neural Network? | 04:01 | |
| 7 | Q7 - Name a few popular activation functions and describe them | 13:43 | |
| 8 | Q8 - What happens if you do not use any activation functions in a NN? | 01:27 | |
| 9 | Q9 - Describe how training of basic Neural Networks works | 05:53 | |
| 10 | Q10 - What is Gradient Descent? | 10:41 | |
| 11 | Q11 - What is the function of an optimizer in Deep Learning? | 05:34 | |
| 12 | Q12 - What is backpropagation, and why is it important in Deep Learning? | 08:39 | |
| 13 | Q13 - How is backpropagation different from gradient descent? | 03:05 | |
| 14 | Q14 - Describe what the Vanishing Gradient Problem is and its impact on NN | 07:01 | |
| 15 | Q15 - Describe what the Exploding Gradients Problem is and its impact on NN | 08:31 | |
| 16 | Q16 - A neuron results in a large error in backpropagation. What could be the reason? | 04:40 | |
| 17 | Q17 - What do you understand by a computational graph? | 06:18 | |
| 18 | Q18 - What is Loss Function and what are various Loss functions used in DL? | 06:39 | |
| 19 | Q19 - What is Cross Entropy loss function and how is it called in industry? | 03:41 | |
| 20 | Q20 - Why is Cross-entropy preferred as cost function for multi-class classification? | 03:40 | |
| 21 | Q21 - What is SGD and why is it used in training Neural Networks? | 06:11 | |
| 22 | Q22 - Why does stochastic gradient descent oscillate towards local minima? | 05:52 | |
| 23 | Q23: How is GD different from SGD | 05:19 | |
| 24 | Q24: What is SGD with Momentum | 06:04 | |
| 25 | Q25 - Batch Gradient Descent, Minibatch Gradient Descent vs SGD | 05:27 | |
| 26 | Q26: What is impact of Batch Size | 06:49 | |
| 27 | Q27: Batch Size vs Model Performance | 04:10 | |
| 28 | Q28: What is Hessian, usage in DL | 04:39 | |
| 29 | Q29: What is RMSProp and how does it work? | 05:29 | |
| 30 | Q30: What is Adaptive Learning | 04:33 | |
| 31 | Q31: What is Adam Optimizer | 07:03 | |
| 32 | Q32: What is AdamW Algorithm in Neural Networks | 04:53 | |
| 33 | Q33: What is Batch Normalization | 08:32 | |
| 34 | Q34: What is Layer Normalization | 03:39 | |
| 35 | Q35: What are Residual Connections | 09:23 | |
| 36 | Q36: What is Gradient Clipping | 03:41 | |
| 37 | Q37: What is Xavier Initialization | 04:05 | |
| 38 | Q38: What are ways to solve Vanishing Gradients | 03:16 | |
| 39 | Q39: How to solve Exploding Gradient Problem | 01:12 | |
| 40 | Q40: What is Overfitting | 02:39 | |
| 41 | Q41: What is Dropout | 05:19 | |
| 42 | Q42: How does Dropout prevent Overfitting in Neural Networks | 00:42 | |
| 43 | Q43: Is Dropout like Random Forest | 04:42 | |
| 44 | Q44: What is the impact of DropOut on the training vs testing | 02:36 | |
| 45 | Q45: What are L2 and L1 Regularizations for Overfitting NN | 03:19 | |
| 46 | Q46: What is the difference between L1 and L2 Regularisations | 04:05 | |
| 47 | Q47: How do L1 vs L2 Regularization impact the Weights in a NN? | 01:52 | |
| 48 | Q48: What is the Curse of Dimensionality in Machine Learning | 02:28 | |
| 49 | Q49 - How Deep Learning models tackle the Curse of Dimensionality | 04:05 | |
| 50 | Q50: What are Generative Models, give examples? | 02:58 | |
| 51 | Q51 - What are Discriminative Models, give examples? | 03:04 | |
| 52 | Q52 - What is the difference between generative and discriminative models? | 08:35 | |
| 53 | Q53 - What are Autoencoders and How Do They Work? | 04:31 | |
| 54 | Q54: What is the Difference Between Autoencoders and other Neural Networks? | 04:32 | |
| 55 | Q55 - What are some popular autoencoders? Mention a few. | 01:25 | |
| 56 | Q56 - What is the role of the Loss function in Autoencoders, & how is it different from other NN? | 01:04 | |
| 57 | Q57 - How do autoencoders differ from PCA? | 02:21 | |
| 58 | Q58 - Which one is better for reconstruction: a linear autoencoder or PCA? | 03:27 | |
| 59 | Q59 - How can you recreate PCA with neural networks? | 06:31 | |
| 60 | Q60 - Can You Explain How Autoencoders Can be Used for Anomaly Detection? | 10:36 | |
| 61 | Q61 - What are some applications of AutoEncoders | 02:20 | |
| 62 | Q62 - How can uncertainty be introduced into Autoencoders, & what are the benefits and challenges of doing so? | 04:09 | |
| 63 | Q63 - Can you explain what VAE is and describe its training process? | 03:18 | |
| 64 | Q64 - Explain what Kullback-Leibler (KL) divergence is & why does it matter in VAEs? | 03:48 | |
| 65 | Q65 - Can you explain what reconstruction loss is & its function in VAEs? | 01:02 | |
| 66 | Q66 - What is ELBO & What is this trade-off between reconstruction quality & regularization? | 04:35 | |
| 67 | Q67 - Can you explain the training & optimization process of VAEs? | 03:49 | |
| 68 | Q68 - How would you balance reconstruction quality and latent space regularization in a practical Variational Autoencoder implementation? | 03:12 | |
| 69 | Q69 - What is Reparametrization trick and why is it important? | 04:15 | |
| 70 | Q70 - What is DGG, "Deep Clustering via a Gaussian-mixture Variational Autoencoder (VAE) with Graph Embedding"? | 01:45 | |
| 71 | Q71 - How does a neural network with one layer and one input and output compare to a logistic regression? | 02:24 | |
| 72 | Q72 - In a logistic regression model, will all the gradient descent algorithms lead to the same model if run for a long time? | 01:06 | |
| 73 | Q73 - What is a Convolutional Neural Network? | 05:10 | |
| 74 | Q74 - What is padding and why is it used in Convolutional Neural Networks (CNNs)? | 02:02 | |
| 75 | Q75 - Padded Convolutions: What are Valid and Same Paddings? | 13:18 | |
| 76 | Q76 - What is stride in CNN and why is it used? | 05:43 | |
| 77 | Q77 - What is the impact of Stride size on CNNs? | 02:28 | |
| 78 | Q78 - What is Pooling, what is the intuition behind it and why is it used in CNNs? | 09:12 | |
| 79 | Q79 - What are common types of pooling in CNN? | 02:49 | |
| 80 | Q80 - Why is min pooling not used? | 03:47 | |
| 81 | Q81 - What is translation invariance and why is it important? | 01:36 | |
| 82 | Q82 - How does a 1D Convolutional Neural Network (CNN) work? | 02:55 | |
| 83 | Q83 - What are Recurrent Neural Networks, and walk me through the architecture of RNNs. | 07:09 | |
| 84 | Q84 - What are the main disadvantages of RNNs, especially in Machine Translation Tasks? | 01:30 | |
| 85 | Q85 - What are some applications of RNN? | 06:16 | |
| 86 | Q86 - What technique is commonly used in RNNs to combat the Vanishing Gradient Problem? | 05:05 | |
| 87 | Q87 - What are LSTMs and their key components? | 05:24 | |
| 88 | Q88 - Which limitations of RNNs do LSTMs address, which do they not, and how? | 06:16 | |
| 89 | Q89 - What is a gated recurrent unit (GRU) and how is it different from LSTMs? | 03:35 | |
| 90 | Q90 - Describe how Generative Adversarial Networks (GANs) work and the roles of the generator and discriminator in learning. | 06:17 | |
| 91 | Q91 - Describe how you would use GANs for image translation or creating photorealistic images. | 04:10 | |
| 92 | Q92 - How would you address mode collapse and vanishing gradients in GAN training, and what is their impact on data quality? | 04:12 | |
| 93 | Q93 - Minimax and Nash Equilibrium in GAN | 09:04 | |
| 94 | Q94 - What are token embeddings and what is their function? | 06:01 | |
| 95 | Q95 - What is self-attention mechanism? | 11:26 | |
| 96 | Q96 - What is Multi-Head Self-Attention and how does it enable more effective processing of sequences in Transformers? | 06:54 | |
| 97 | Q97 - What are transformers and why are they important in combating problems of models like RNN and LSTMs? | 05:52 | |
| 98 | Q98 - Walk me through the architecture of transformers. | 08:56 | |
| 99 | Q99 - What are positional encodings and how are they calculated? | 05:48 | |
| 100 | Q100 - Why do we add positional encodings to Transformers but not to | 02:13 |