LinkedCulture
Museum Collections Search Using Vector Embeddings
This prototype investigates the application of vector embeddings to enhance semantic search across cultural heritage metadata.
Research Context
Object metadata from open access collections at The Met, the Smithsonian, and the Harvard Art Museums is transformed into high-dimensional vector representations using the nomic-embed-text:latest model served via Ollama and indexed in a Qdrant vector store.
Hypothesis
Traditional keyword search often fails to return relevant results when user queries contain ambiguity, abstraction, or metaphor. This interface evaluates whether vector-based semantic retrieval can improve result relevance by measuring conceptual similarity rather than relying on exact term matching.
Process Overview (All Open Source)
Metadata → Embedding via Ollama → Vector Search (Qdrant) → Ranked Result Presentation
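The sketch below illustrates this pipeline end to end, assuming a local Ollama server on its default port and a local Qdrant instance; the collection name "museum_objects" and the sample record are placeholders for illustration, not the project's actual configuration.

```python
# Minimal pipeline sketch: metadata -> embedding (Ollama) -> vector search (Qdrant).
import requests
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

OLLAMA_URL = "http://localhost:11434/api/embeddings"  # default Ollama embeddings endpoint
MODEL = "nomic-embed-text:latest"

def embed(text: str) -> list[float]:
    """Request a single embedding vector from the local Ollama server."""
    resp = requests.post(OLLAMA_URL, json={"model": MODEL, "prompt": text}, timeout=60)
    resp.raise_for_status()
    return resp.json()["embedding"]

client = QdrantClient(url="http://localhost:6333")  # local Qdrant instance

# 1. Metadata -> embedding: flatten one object record into a single string and embed it.
record = {"title": "Funerary mask", "medium": "Gold", "culture": "Greek"}
vector = embed(" | ".join(f"{k}: {v}" for k, v in record.items()))

# 2. Index in Qdrant (vector size taken from the model output; assumes the collection is new).
client.create_collection(
    collection_name="museum_objects",
    vectors_config=VectorParams(size=len(vector), distance=Distance.COSINE),
)
client.upsert(
    collection_name="museum_objects",
    points=[PointStruct(id=1, vector=vector, payload=record)],
)

# 3. Vector search: embed the query and rank results by cosine similarity.
hits = client.search(
    collection_name="museum_objects",
    query_vector=embed("ritual object from burial"),
    limit=10,
)
for hit in hits:
    print(round(hit.score, 3), hit.payload.get("title"))
```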
Data Sources
Primary Sources:
• The Met Collection – Includes ~3,100 example records. The API lacks narrative object descriptions, so embeddings are based on short-form metadata only (see the sketch after this list).
• Smithsonian Open Access – Includes ~4,700 records from Smithsonian Asian Art (Freer|Sackler). Records often lack narrative descriptions and detailed classifications, which may affect vector relevance.
• Harvard Art Museums – Includes ~8,600 records filtered to Paintings | Sculpture | Prints | Drawings. Objects without a valid image are excluded from the search index to reduce noise from placeholder thumbnails.
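A small sketch of how a short-form metadata record might be flattened into a single embedding string. The field names follow The Met's public object API, but the selection, ordering, and image filter shown here are assumptions for illustration rather than the project's exact recipe.

```python
# Assumed short-form fields from a Met object record; adjust per source museum.
MET_FIELDS = [
    "title", "objectName", "artistDisplayName", "culture",
    "period", "objectDate", "medium", "classification",
]

def to_embedding_text(record: dict) -> str | None:
    """Concatenate non-empty short-form fields into one string for embedding.
    Records without a valid image are skipped (mirroring the Harvard filter)
    to keep placeholder thumbnails out of the search index."""
    if not record.get("primaryImageSmall"):
        return None
    parts = [str(record[f]) for f in MET_FIELDS if record.get(f)]
    return " | ".join(parts) if parts else None
```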
Search
This mode sends your query to each museum’s vector index individually and merges the results (see the sketch below). Scores are computed the same way as in unified search, but responses may be slower.
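A minimal sketch of this per-museum mode, reusing the embed() helper and Qdrant client from the pipeline sketch above; the per-museum collection names are assumptions for illustration.

```python
# Search each museum's collection separately, then merge and rank the hits.
COLLECTIONS = ["met_objects", "smithsonian_objects", "harvard_objects"]

def search_all(query: str, limit: int = 10):
    """Embed the query once, search every collection, and merge by score."""
    vector = embed(query)
    merged = []
    for name in COLLECTIONS:
        for hit in client.search(collection_name=name, query_vector=vector, limit=limit):
            merged.append((hit.score, name, hit.payload))
    # Scores come from the same embedding model and distance metric,
    # so they are directly comparable across collections.
    merged.sort(key=lambda item: item[0], reverse=True)
    return merged[:limit]
```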
Example searches using conceptual phrases:
“ritual object from burial”, “duality of myth”, “portrait of grief”, “symbols of migration”, “dreamlike landscapes”, “rebirth and nature”
These compound queries highlight the strength of vector search over traditional keyword matching.