Your task is to implement the function create_row_hv(row, dim, random_seeds) to generate a composite hypervector for a given dataset row using Hyperdimensional Computing (HDC). Each feature in the row is represented by binding hypervectors for the feature name and its value. All feature hypervectors are then bundled to create a composite hypervector for the row.
Input:
row: A dictionary representing a dataset row, where keys are feature names and values are their corresponding values.
dim: The dimensionality of the hypervectors.
random_seeds: A dictionary where keys are feature names and values are seeds to ensure reproducibility of hypervectors.
Output: A composite hypervector representing the entire row.
Example:
row = {"FeatureA": "value1", "FeatureB": "value2"}
dim = 5
random_seeds = {"FeatureA": 42, "FeatureB": 7}
print(create_row_hv(row, dim, random_seeds))
Output: [ 1, -1, 1, 1, 1]
Hyperdimensional Computing (HDC) is a computational model inspired by the brain's ability to represent and process information using high-dimensional vectors. It relies on the fact that randomly generated hypervectors are quasi-orthogonal to one another. Data is represented by vectors with a large number of dimensions, where each vector is typically filled with binary values (0 or 1) or bipolar values (+1 or -1). To represent complex data patterns, binding and bundling operations are used. In HDC, different data types such as numeric and categorical variables are projected into high-dimensional space through specific encoding processes. Categorical variables are assigned unique hypervectors, often randomly generated binary or bipolar vectors, that serve as representations for each category. Numeric variables are encoded by discretising the continuous values and mapping the resulting bins to hypervectors. These projections allow HDC models to integrate various data types into a unified high-dimensional representation, preserving information across complex, multi-feature datasets.
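As a rough illustration of the encoding step described above, the sketch below assigns a random bipolar hypervector to each category and to each bin of a discretised numeric feature. The helper names (random_hv, encode_categorical, encode_numeric), the category labels, the bin edges, and the use of np.random.default_rng are all assumptions made for this example; they are not part of the task's reference solution.

```python
import numpy as np

def random_hv(dim, rng):
    # Random bipolar hypervector with entries in {-1, +1}
    return rng.choice([-1, 1], size=dim)

def encode_categorical(categories, dim, seed=0):
    # Hypothetical helper: one independent random hypervector per category
    rng = np.random.default_rng(seed)
    return {c: random_hv(dim, rng) for c in categories}

def encode_numeric(value, bin_edges, bin_hvs):
    # Hypothetical helper: discretise the value into a bin index,
    # then look up the hypervector assigned to that bin
    idx = int(np.digitize(value, bin_edges))
    return bin_hvs[idx]

dim = 10_000
cat_hvs = encode_categorical(["red", "green", "blue"], dim, seed=1)

bin_edges = [10.0, 20.0, 30.0]  # 3 edges define 4 bins (illustrative values)
rng = np.random.default_rng(2)
bin_hvs = [random_hv(dim, rng) for _ in range(len(bin_edges) + 1)]

hv_a = encode_numeric(12.5, bin_edges, bin_hvs)  # falls in bin 1
hv_b = encode_numeric(17.0, bin_edges, bin_hvs)  # also bin 1, same hypervector
hv_c = encode_numeric(25.0, bin_edges, bin_hvs)  # bin 2, different hypervector

# Normalised dot product: close to 0 for independent random hypervectors
similarity = float(cat_hvs["red"] @ cat_hvs["green"]) / dim
```

With a large dimensionality, the normalised dot product between two independently drawn hypervectors stays close to zero, which is the quasi-orthogonality the paragraph refers to.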
The binding operation between two hypervectors is performed element-wise using multiplication. This operation is used to represent associations between different pieces of information:
\[ \text{bind}(\text{hv1}, \text{hv2}) = \text{hv1} \times \text{hv2} \]
where \( \text{hv1} \) and \( \text{hv2} \) are bipolar vectors, and their element-wise multiplication results in a new vector where each element is either 1 or -1.
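A minimal sketch of binding for bipolar vectors, using hand-picked five-element vectors so the result can be checked by eye. The vectors are illustrative only. Because every element is +1 or -1, binding the result with hv2 a second time recovers hv1.

```python
import numpy as np

def bind(hv1, hv2):
    # Element-wise multiplication of two bipolar hypervectors
    return hv1 * hv2

hv1 = np.array([ 1, -1,  1,  1, -1])
hv2 = np.array([-1, -1,  1, -1, -1])

bound = bind(hv1, hv2)
print(bound)        # [-1  1  1 -1  1]

# Each element of hv2 squares to 1, so binding with hv2 again undoes the binding
recovered = bind(bound, hv2)
print(recovered)    # [ 1 -1  1  1 -1]
```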
The bundling operation combines multiple hypervectors into one. For bipolar vectors this is an element-wise sum; for binary vectors the usual counterpart is an element-wise majority vote (XOR is the binary counterpart of binding, not of bundling). This operation aggregates information and creates a composite hypervector that represents the overall data or concept. For example, for a set of \( n \) hypervectors \( \text{hv1}, \text{hv2}, \dots, \text{hvn} \), the bundled vector is:
\[ \text{bundle}(\text{hv1}, \text{hv2}, \dots, \text{hvn}) = \sum_{i=1}^{n} \text{hvi} \]
This bundled vector is then normalized to ensure it remains bipolar.
Normalization ensures that the final bundled vector contains only bipolar or binary values. The normalization function typically applies a thresholding process that transforms any value greater than zero to +1 and any value less than zero to -1. Zero values are then typically assigned to either +1 or -1.
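The sum-then-threshold procedure can be sketched as follows. The function names and the tie_value parameter are assumptions for this example; sending zeros to +1 by default is one possible convention, not the only one.

```python
import numpy as np

def normalize(vector, tie_value=1):
    # Threshold at zero: positive -> +1, negative -> -1, zero -> tie_value
    vector = np.asarray(vector)
    return np.where(vector > 0, 1, np.where(vector < 0, -1, tie_value))

def bundle(hvs):
    # Element-wise sum followed by normalization back to bipolar values
    return normalize(np.sum(hvs, axis=0))

hvs = [
    np.array([ 1, -1,  1, -1]),
    np.array([ 1,  1, -1, -1]),
]

summed = np.sum(hvs, axis=0)
print(summed)        # [ 2  0  0 -2]
print(bundle(hvs))   # [ 1  1  1 -1]
```

Bundling an even number of hypervectors can produce zeros, as in the two middle positions here, which is why the tie-breaking rule matters.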
Consider a scenario where we want to represent and combine information from each feature in a row of a dataset. Each feature, whether numeric or categorical, is represented by a hypervector, and these hypervectors are combined to form a composite vector that represents the entire row of data.
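For instance, if we have a dataset row with features Feature A and Feature B, we would generate a hypervector for each feature name and for its value, bind each name hypervector with the corresponding value hypervector, bundle the two bound hypervectors, and normalize the result to obtain the composite hypervector for the row.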
Hyperdimensional computing has a variety of applications, including:
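import numpy as np

def create_hv(dim):
    # Random bipolar hypervector with entries in {-1, +1}
    return np.random.choice([-1, 1], dim)

def create_col_hvs(dim, seed):
    # Seed the global RNG so the feature-name and value hypervectors are reproducible
    np.random.seed(seed)
    return create_hv(dim), create_hv(dim)

def bind(hv1, hv2):
    # Element-wise multiplication
    return hv1 * hv2

def bundle(hvs, dim):
    # Element-wise sum of all feature hypervectors, then normalization
    bundled = np.sum(list(hvs.values()), axis=0)
    return sign(bundled)

def sign(vector, threshold=0.01):
    # Non-negative sums (including zero) map to +1, negative sums to -1; threshold is unused
    return np.array([1 if v >= 0 else -1 for v in vector])

def create_row_hv(row, dim, random_seeds):
    # Bind the name and value hypervectors of each feature, then bundle across features
    row_hvs = {col: bind(*create_col_hvs(dim, random_seeds[col])) for col in row.keys()}
    return bundle(row_hvs, dim)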