CocoIndex now provides robust and flexible support for typed vector data — from simple numeric arrays to deeply nested multi-dimensional vectors. This support is designed for seamless integration with high-performance vector databases such as Qdrant, and enables advanced indexing, embedding, and retrieval workflows across diverse data modalities.

In this post, we’ll break down:


✅ Supported Python Vector Types

CocoIndex accepts a range of vector types with strong typing guarantees.

Note that CocoIndex automatically infer types, so if you’re defining a flow, you don’t need to explicitly specify any types like this. You only need to explicitly specify types when you define custom functions.

✅ 1. One-dimensional vectors


from cocoindex import Vector, Float32
from typing import Literal

Vector[Float32]                         # dynamic dimension
Vector[Float32, Literal[384]]          # fixed dimension of 384

These types are interpreted as:


✅ 2. Two-dimensional vectors (multi-vectors)


Vector[Vector[Float32, Literal[3]], Literal[2]]

This declares a vector with two rows of 3-element vectors. These are treated as multi-vectors (e.g., multiple embeddings per point).


🧠 What is a Multi-Dimensional Vector?

A multi-dimensional vector is simply a vector whose elements are themselves vectors — essentially a matrix or a nested list. In CocoIndex, we represent this using Vector[Vector[T, N], M], meaning M vectors, each of dimension N. M and N are optional - CocoIndex doesn't require them to be fixed, while some targets have requirements, e.g. a multi-vector exported to Qdrant needs to have a fixed inner dimension, i.e. Vector[Vector[T, N]].

This concept is crucial in deep learning and multimodal applications — for example:

Instead of flattening these rich representations, CocoIndex supports them natively through nested Vector typing.


🧭 When to Use Multi-Vector Embeddings

Here are scenarios where multi-vector embeddings shine:

🖼️ 1. Vision: Patch Embeddings

Imagine a 1024x1024 image processed by a Vision Transformer (ViT). Instead of one global image vector, ViT outputs a vector per patch (e.g. 8x8 patches → 64 vectors).

Use case:


Vector[Vector[Float32, Literal[768]]]

📄 2. Text: Document with Paragraphs

A long document split into paragraphs or sentences:


Vector[Vector[Float32, Literal[384]]]

You can now:

👤 3. User Behavior: Sessions or Time Series

Each user session can be viewed as a sequence of actions, clicks, or states:


Vector[Vector[Float32, Literal[16]]]

Useful in:

🧬 4. Scientific & Biomedical: Multi-view Embeddings

A molecule or protein might have multiple conformations, graphs, or projections:


Vector[Vector[Float32, Literal[128]]]

🔍 Why Not Just Use a Flat Vector?

You could flatten a 2x3 multi-vector like [[1,2,3],[4,5,6]] into [1,2,3,4,5,6] — but you’d lose semantic boundaries.

With true multi-vectors:


🧬 CocoIndex → Qdrant Type Mapping

Qdrant is a popular vector database that supports both dense vectors and multi-vectors. CocoIndex maps vector types to Qdrant-compatible formats as follows:

CocoIndex Type

Qdrant Type

Vector[Float32, Literal[N]]

Dense Vector

Vector[Vector[Float32, Literal[N]]]

MultiVector

Anything else

Stored as part of Qdrant’s JSON payload

Qdrant only accepts vectors with fixed dimension. CocoIndex automatically detects vector shapes and maps unsupported or dynamic-dimension vectors into payloads instead.


🧾 CocoIndex Data Model Recap

In CocoIndex, data is structured as rows and fields:

This hybrid approach gives you the flexibility of structured metadata with the power of vector search.


Final Thoughts

With the ability to support deeply typed vectors and multi-dimensional embeddings, CocoIndex brings structure and semantic clarity to vector data pipelines. Whether you're indexing images, text, audio, or abstract graph representations — our typing system ensures compatibility, debuggability, and correctness at scale.

Ready to bring your own LEGO to the vector world? Start building with CocoIndex today.