I'm trying to build a CNN in Python using only NumPy. I have a working implementation, but it seems it could be faster. In my tests, the forward pass of the convolutional layer is the biggest bottleneck:
    def forward(self, a_prev, training):
        batch_size = a_prev.shape[0]
        a_prev_padded = Conv.zero_pad(a_prev, self.pad)
        out = np.zeros((batch_size, self.n_h, self.n_w, self.n_c))

        # Convolve
        for i in range(self.n_h):
            v_start = i * self.stride
            v_end = v_start + self.kernel_size

            for j in range(self.n_w):
                h_start = j * self.stride
                h_end = h_start + self.kernel_size

                out[:, i, j, :] = np.sum(
                    a_prev_padded[:, v_start:v_end, h_start:h_end, :, np.newaxis] * self.w[np.newaxis, :, :, :],
                    axis=(1, 2, 3),
                )

        z = out + self.b
        a = self.activation.f(z)

        if training:
            # Cache for backward pass
            self.cache.update({"a_prev": a_prev, "z": z, "a": a})

        return a
This is the code I have written. I suspect the nested for loops are part of the problem, but I'm not sure. Is there a way to speed this up, or should I use a different algorithm?
Thanks in advance
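
For reference, one direction that might remove the two Python loops is to build all the kernel-sized windows at once with `np.lib.stride_tricks.sliding_window_view` (available in NumPy 1.20 and later) and contract them against the weights in a single `np.tensordot` call. The sketch below is only an illustration under several assumptions: `conv_forward_vectorized` is a hypothetical standalone function rather than the `forward` method above, `np.pad` stands in for `Conv.zero_pad` (whose implementation is not shown), the weights are assumed to have shape `(kernel_size, kernel_size, n_c_prev, n_c)` as the broadcasting in the loop suggests, and `b` is assumed to broadcast against the output. The activation and the cache are left out.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv_forward_vectorized(a_prev, w, b, stride=1, pad=0):
    """Loop-free forward convolution (hypothetical helper, not the original method).

    Assumed shapes:
      a_prev: (batch, height, width, n_c_prev)
      w:      (kernel_h, kernel_w, n_c_prev, n_c)
      b:      anything that broadcasts to (batch, n_h, n_w, n_c)
    """
    # Zero-pad height and width only (assumed equivalent to Conv.zero_pad).
    a_padded = np.pad(
        a_prev, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode="constant"
    )

    k_h, k_w = w.shape[0], w.shape[1]

    # Read-only view of every window at stride 1, shape:
    # (batch, H_p - k_h + 1, W_p - k_w + 1, n_c_prev, k_h, k_w)
    windows = sliding_window_view(a_padded, (k_h, k_w), axis=(1, 2))

    # Keep only the windows that start on the stride grid.
    windows = windows[:, ::stride, ::stride]

    # Sum over kernel rows, kernel columns, and input channels at once.
    # Result shape: (batch, n_h, n_w, n_c)
    z = np.tensordot(windows, w, axes=([4, 5, 3], [0, 1, 2]))

    return z + b
```

Whether this is actually faster depends on the layer sizes, so it is worth timing against the loop version. `np.tensordot` has to copy the windows into a contiguous array before the matrix product, which trades extra memory for fewer Python-level iterations.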