Example:
X = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])
y = np.array([1, 2, 3, 4, 5])
batch_size = 2
batch_iterator(X, y, batch_size)

output:
[[[[1, 2], [3, 4]], [1, 2]], [[[5, 6], [7, 8]], [3, 4]], [[[9, 10]], [5]]]

Reasoning: The dataset X contains 5 samples, and we are using a batch size of 2. Therefore, the function will divide the dataset into 3 batches. The first two batches will contain 2 samples each, and the last batch will contain the remaining sample. The corresponding values from y are also included in each batch.
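The batch count in the reasoning above follows from a ceiling division. A minimal sketch of that arithmetic, where the names `n_batches` and `last_batch_size` are illustrative and do not appear in the original:

```python
import math

n_samples = 5
batch_size = 2

# Number of batches: ceiling of n_samples / batch_size
n_batches = math.ceil(n_samples / batch_size)

# Size of the final batch: the remainder, or a full batch if it divides evenly
last_batch_size = n_samples % batch_size or batch_size

print(n_batches, last_batch_size)  # 3 1
```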
Batch iteration is a common technique used in machine learning and data processing to handle large datasets more efficiently. Instead of processing the entire dataset at once, which can be memory-intensive, data is processed in smaller, more manageable batches.
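Here's a step-by-step method to create a batch iterator:
1. Read the number of samples from the first dimension of X.
2. Step through the sample indices from 0 to the number of samples in increments of batch_size.
3. For each step, compute the begin and end indices of the batch, capping the end index at the number of samples so the last batch holds whatever remains.
4. Slice X with those indices and, if y is provided, slice y with the same indices so samples and labels stay aligned.
5. Collect each batch in a list and return it.
This method limits each processing step to at most batch_size samples and can be used for both training and evaluation phases in machine learning workflows.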
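import numpy as np

def batch_iterator(X, y=None, batch_size=64):
    n_samples = X.shape[0]
    batches = []
    # Step through the samples in increments of batch_size
    for i in np.arange(0, n_samples, batch_size):
        # Cap the end index so the last batch holds the remaining samples
        begin, end = i, min(i + batch_size, n_samples)
        if y is not None:
            # Pair each slice of X with the matching slice of y
            batches.append([X[begin:end], y[begin:end]])
        else:
            batches.append(X[begin:end])
    return batches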
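The solution above builds every batch up front and returns them in a list. As a possible alternative, the same slicing logic can be written as a generator that yields one batch at a time. This is a sketch, not part of the original solution: the name `batch_generator` is hypothetical, and it yields tuples rather than lists.

```python
import numpy as np

def batch_generator(X, y=None, batch_size=64):
    """Hypothetical lazy variant: yield one batch at a time."""
    n_samples = X.shape[0]
    for begin in range(0, n_samples, batch_size):
        # Cap the end index so the last batch holds the remaining samples
        end = min(begin + batch_size, n_samples)
        if y is not None:
            yield X[begin:end], y[begin:end]
        else:
            yield X[begin:end]

# Same data as the example in the problem statement
X = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])
y = np.array([1, 2, 3, 4, 5])

batches = list(batch_generator(X, y, batch_size=2))
for X_batch, y_batch in batches:
    print(X_batch.tolist(), y_batch.tolist())
```

A generator like this suits loops that consume each batch once, since no list of batches has to be kept around; calling `list(...)` on it, as done here for printing, gives up that benefit.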