
Semantic Search

Lorn AI uses semantic search to understand the meaning behind product queries, not just match keywords. This enables natural language shopping experiences where users can describe what they want in their own words.


How It Works

Traditional search matches exact keywords. Semantic search understands intent:

| Query | Keyword Search | Semantic Search |
| --- | --- | --- |
| "something warm for winter" | ❌ No matches | ✅ Jackets, sweaters, coats |
| "gift for a runner" | ❌ "gift" not in products | ✅ Running shoes, fitness trackers |
| "comfortable WFH outfit" | ❌ No "WFH" products | ✅ Loungewear, athleisure |
| "eco-friendly water bottle" | ⚠️ Only if exact match | ✅ Reusable bottles, sustainable products |

The Technology Stack

┌─────────────────────────────────────────────────────────────────────────┐
│                           Semantic Search Pipeline                       │
│                                                                         │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌──────────┐ │
│  │             │    │             │    │             │    │          │ │
│  │   Query     │───▶│   OpenAI    │───▶│  pgvector   │───▶│ Ranked   │ │
│  │   "warm     │    │  Embedding  │    │  Similarity │    │ Results  │ │
│  │   jacket"   │    │             │    │   Search    │    │          │ │
│  │             │    │             │    │             │    │          │ │
│  └─────────────┘    └─────────────┘    └─────────────┘    └──────────┘ │
│                                                                         │
│       User's              Vector              Find nearest              │
│       natural             [0.12, -0.34,       neighbors in              │
│       language            0.56, ...]          product vectors           │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
  1. User Query → Natural language input
  2. OpenAI Embedding → Convert to 1536-dimensional vector using text-embedding-3-small
  3. pgvector Search → Find products with similar embedding vectors
  4. Ranked Results → Return products sorted by similarity score
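
The four steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Lorn AI's implementation: the three-dimensional vectors and product names are made up, and the query vector is hard-coded where the real pipeline would call the embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_products(query_vector, product_vectors, limit=10):
    """Return (product_id, similarity) pairs, most similar first."""
    scored = [
        (product_id, cosine_similarity(query_vector, vector))
        for product_id, vector in product_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]

# Hypothetical 3-dimensional embeddings (real ones have 1536 dimensions).
PRODUCT_VECTORS = {
    "winter_jacket": [0.9, 0.1, 0.0],
    "wool_sweater": [0.7, 0.3, 0.1],
    "beach_sandals": [0.0, 0.2, 0.9],
}

# Stand-in for the embedding of the query "warm jacket".
query_vector = [1.0, 0.0, 0.0]

for product_id, score in rank_products(query_vector, PRODUCT_VECTORS):
    print(f"{product_id}: {score:.2f}")
```

Products whose vectors point in nearly the same direction as the query vector rank first, which is the "nearest neighbors" step in the diagram.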

Basic Query

curl "https://{{YOUR_STORE_URL}}/acp/products?q=comfortable+running+shoes" \
  -H "X-ACP-API-Key: {{YOUR_API_KEY}}"
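
The `+` signs in the URL are URL-encoded spaces. If you build the URL in code, a minimal sketch with Python's standard library looks like this; `build_search_url` is a hypothetical helper and `store.example.com` is a placeholder host.

```python
from urllib.parse import urlencode

def build_search_url(store_url, query):
    """Build the product search URL, encoding spaces in the query as '+'."""
    return f"https://{store_url}/acp/products?{urlencode({'q': query})}"

print(build_search_url("store.example.com", "comfortable running shoes"))
# https://store.example.com/acp/products?q=comfortable+running+shoes
```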

Response with Similarity Scores

{
  "items": [
    {
      "id": "prod_nike_pegasus",
      "title": "Nike Air Zoom Pegasus 40",
      "description": "Responsive cushioning for a smooth, comfortable ride",
      "price": 129.99,
      "similarity_score": 0.92
    },
    {
      "id": "prod_brooks_ghost",
      "title": "Brooks Ghost 15",
      "description": "Soft cushioning and smooth transitions",
      "price": 139.99,
      "similarity_score": 0.89
    },
    {
      "id": "prod_asics_gel",
      "title": "ASICS Gel-Nimbus 25",
      "description": "Maximum cushioning for long-distance comfort",
      "price": 159.99,
      "similarity_score": 0.87
    }
  ],
  "page": 1,
  "page_size": 10,
  "total": 3
}

Understanding Similarity Scores

| Score Range | Interpretation |
| --- | --- |
| 0.90 - 1.00 | Excellent match |
| 0.80 - 0.89 | Good match |
| 0.70 - 0.79 | Moderate match |
| 0.60 - 0.69 | Weak match |
| < 0.60 | Poor match |
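
A client can apply these ranges when deciding which results to show. The sketch below is illustrative: the labels come from the table above, but the helper names and the default cutoff of 0.70 are assumptions, not part of the API.

```python
def interpret_score(score):
    """Map a similarity score to the label from the table above."""
    if score >= 0.90:
        return "Excellent match"
    if score >= 0.80:
        return "Good match"
    if score >= 0.70:
        return "Moderate match"
    if score >= 0.60:
        return "Weak match"
    return "Poor match"

def filter_by_score(items, min_score=0.70):
    """Keep items whose similarity_score meets the threshold (assumed cutoff)."""
    return [item for item in items if item.get("similarity_score", 0.0) >= min_score]
```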

Query Types That Work Well

Descriptive Queries

Describe what you’re looking for:

# What the product does
?q=headphones+that+block+outside+noise
 
# How it should feel
?q=soft+comfortable+sweater+for+winter
 
# Use case
?q=laptop+bag+for+daily+commute

Intent-Based Queries

Describe the purpose:

# Gift shopping
?q=birthday+gift+for+teenager
 
# Occasion-based
?q=outfit+for+job+interview
 
# Activity-based
?q=gear+for+camping+trip

Problem-Solution Queries

Describe the problem you’re solving:

# Pain point
?q=back+support+for+office+chair
 
# Goal
?q=help+me+sleep+better
 
# Comparison
?q=alternative+to+airpods

Combining Semantic Search with Filters

Semantic search works with traditional filters:

Price Range

curl "https://{{YOUR_STORE_URL}}/acp/products?q=wireless+earbuds&min_price=50&max_price=150" \
  -H "X-ACP-API-Key: {{YOUR_API_KEY}}"

Category Filter

curl "https://{{YOUR_STORE_URL}}/acp/products?q=lightweight+running&category=Footwear" \
  -H "X-ACP-API-Key: {{YOUR_API_KEY}}"

Tag Filter

curl "https://{{YOUR_STORE_URL}}/acp/products?q=warm+jacket&tag=sale" \
  -H "X-ACP-API-Key: {{YOUR_API_KEY}}"

Combined Filters

curl "https://{{YOUR_STORE_URL}}/acp/products?q=professional+laptop+bag&category=Accessories&min_price=50&max_price=200" \
  -H "X-ACP-API-Key: {{YOUR_API_KEY}}"
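
When filters are optional in your application, it helps to send only the ones the user actually set. A minimal sketch, assuming a hypothetical `build_search_params` helper; the parameter names match the examples above.

```python
def build_search_params(q=None, category=None, tag=None, min_price=None, max_price=None):
    """Collect the supplied search parameters, skipping any left as None."""
    candidates = {
        "q": q,
        "category": category,
        "tag": tag,
        "min_price": min_price,
        "max_price": max_price,
    }
    return {name: value for name, value in candidates.items() if value is not None}

params = build_search_params(
    q="professional laptop bag", category="Accessories", min_price=50, max_price=200
)
print(params)
# {'q': 'professional laptop bag', 'category': 'Accessories', 'min_price': 50, 'max_price': 200}
```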

How Products Are Indexed

Embedding Generation

When products are added to the catalog, embeddings are generated from:

Embedding Input = Title + Description + Category + Tags

Example:

"Nike Air Zoom Pegasus 40 - Responsive cushioning meets a breathable 
upper for a comfortable ride mile after mile. Footwear > Running. 
Tags: running, athletic, cushioned, lightweight"

This text is converted to a 1536-dimensional vector that captures the product’s semantic meaning.
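
To make the input format concrete, here is a sketch that assembles the text from a product record. The document only states which four fields are used; the exact separators below are an assumption inferred from the example, and `build_embedding_input` is a hypothetical name.

```python
def build_embedding_input(product):
    """Join title, description, category, and tags into one text to embed."""
    tags = ", ".join(product.get("tags", []))
    return (
        f"{product['title']} - {product['description']} "
        f"{product['category']}. Tags: {tags}"
    )

product = {
    "title": "Nike Air Zoom Pegasus 40",
    "description": "Responsive cushioning meets a breathable upper for a comfortable ride mile after mile.",
    "category": "Footwear > Running",
    "tags": ["running", "athletic", "cushioned", "lightweight"],
}
print(build_embedding_input(product))
```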

Database Storage

-- Products table with vector column
CREATE TABLE products (
    id TEXT PRIMARY KEY,
    title TEXT,
    description TEXT,
    embedding VECTOR(1536),  -- pgvector column
    ...
);
 
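-- Similarity search query
-- <=> is pgvector's cosine distance operator, so 1 - distance is the cosine similarity.
-- query_embedding stands for the 1536-dimensional vector generated from the user's query.
SELECT *, 1 - (embedding <=> query_embedding) AS similarity
FROM products
ORDER BY embedding <=> query_embedding
LIMIT 10;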

Best Practices for Product Data

Rich Descriptions

Good:

{
  "title": "Nike Air Zoom Pegasus 40",
  "description": "The Nike Air Zoom Pegasus 40 delivers responsive cushioning and a breathable upper for a comfortable ride. Perfect for daily training runs, featuring Zoom Air units in the forefoot for energy return."
}

Bad:

{
  "title": "Nike Pegasus",
  "description": "Running shoe"
}

Descriptive Tags

Include tags that match how users search:

{
  "tags": [
    "running",
    "athletic",
    "cushioned",
    "daily trainer",
    "neutral",
    "lightweight",
    "breathable"
  ]
}

Structured Categories

Use hierarchical categories:

{
  "category": "Footwear > Running > Neutral"
}
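
These guidelines can be turned into a simple pre-import check. The sketch below is illustrative only: `check_product_data` is a hypothetical helper, and the thresholds (15 words, 3 tags) are arbitrary assumptions, not limits enforced by Lorn AI.

```python
def check_product_data(product, min_description_words=15, min_tags=3):
    """Return a list of warnings for data that may hurt search relevance."""
    warnings = []
    if len(product.get("description", "").split()) < min_description_words:
        warnings.append("description is short; add features and use cases")
    if len(product.get("tags", [])) < min_tags:
        warnings.append("few tags; add terms shoppers actually search for")
    if ">" not in product.get("category", ""):
        warnings.append("category is not hierarchical (e.g. 'Footwear > Running')")
    return warnings

print(check_product_data({"title": "Nike Pegasus", "description": "Running shoe"}))
```

An empty list means the record passed all three checks.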

Semantic vs. Filter-Only Search

When q is provided, semantic search is used. Without q, filter-only search applies:

Semantic Search (with q)

# Uses embeddings to find meaning
GET /acp/products?q=warm+winter+jacket
  • Matches based on meaning
  • Returns similarity scores
  • Best for natural language queries

Filter-Only Search (without q)

# Uses exact filters only
GET /acp/products?category=Outerwear&tag=winter
  • Matches based on exact field values
  • No similarity scoring
  • Best for browsing/filtering
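
Client code that handles both modes should not assume a score is always present. A minimal sketch with hypothetical helper names; it assumes an empty `q` behaves like a missing one and that filter-only results either omit `similarity_score` or set it to null, neither of which this page confirms.

```python
def search_mode(params):
    """Return which search mode a parameter set triggers."""
    return "semantic" if params.get("q") else "filter-only"

def describe_item(item):
    """Format an item, showing its score only when one is present."""
    score = item.get("similarity_score")
    if score is None:
        return item["title"]
    return f"{item['title']} ({score:.2f})"
```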

Performance Considerations

Query Latency

| Component | Typical Latency |
| --- | --- |
| OpenAI embedding | 50-100ms |
| pgvector search | 10-50ms |
| Total | 60-150ms |

Optimization Tips

  1. Use filters to narrow scope

    # Faster: filter + semantic
    ?q=running+shoes&category=Footwear
     
    # Slower: semantic only across all products
    ?q=running+shoes
  2. Limit result size

    ?q=headphones&page_size=10
  3. Cache common queries — Implement caching for frequently searched terms
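
For the caching tip, one possible client-side approach is a small in-memory cache with a time-to-live. This is a sketch, not a built-in feature: `QueryCache` is a hypothetical class, the 300-second TTL is an arbitrary assumption, and `fetch` is whatever function performs your real search.

```python
import time

class QueryCache:
    """Small in-memory cache with a time-to-live for search results."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch      # function that performs the real search
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # normalized query -> (expiry time, results)

    def search(self, query):
        # Normalize case and whitespace so near-identical queries share an entry.
        key = " ".join(query.lower().split())
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        results = self._fetch(query)
        self._entries[key] = (now + self._ttl, results)
        return results
```

Repeated searches for "Running Shoes" and "running  shoes" within the TTL would then trigger a single upstream request.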


Troubleshooting

No Results Returned

Cause: Query too specific or no matching products

Solution:

  • Broaden the query
  • Remove restrictive filters
  • Check that products exist in the catalog

Poor Relevance

Cause: Product descriptions lack detail

Solution:

  • Enrich product descriptions
  • Add relevant tags
  • Include use-case information

Unexpected Results

Cause: Semantic matching found related but different products

Solution:

  • Add category/tag filters to constrain results
  • Make query more specific
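
The "remove restrictive filters" advice for empty results can be automated. The sketch below retries with one filter fewer each time; `search_with_fallback` is a hypothetical helper, and `search` is assumed to be any callable that accepts `q` plus filter keyword arguments and returns a list of items.

```python
def search_with_fallback(search, q, filters):
    """Try the full search, then drop filters one at a time until items appear."""
    remaining = dict(filters)
    while True:
        items = search(q=q, **remaining)
        if items or not remaining:
            return items, remaining
        remaining.popitem()  # remove the most recently added filter
```

The function returns the filters that were still applied, so the caller can tell the user which constraints were relaxed.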

Examples by Use Case

Shopping Assistant

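import requests

# Placeholders: substitute your own store URL and API key.
BASE_URL = "https://{{YOUR_STORE_URL}}"
HEADERS = {"X-ACP-API-Key": "{{YOUR_API_KEY}}"}

def find_products(user_message):
    """Run a semantic search for the user's message and return the matching items."""
    response = requests.get(
        f"{BASE_URL}/acp/products",
        params={"q": user_message, "page_size": 5},
        headers=HEADERS,
        timeout=10,
    )
    response.raise_for_status()  # Surface HTTP errors instead of parsing an error body
    return response.json()["items"]

# User: "I need something for my morning jog"
products = find_products("comfortable shoes for morning jog")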

Gift Finder

# Find gifts by recipient and occasion
products = search_products(
    q="gift for dad who loves golf",
    max_price=100
)

Comparison Shopping

# Find alternatives to a known product
products = search_products(
    q="wireless noise canceling headphones like Sony WH-1000XM5"
)
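
The Gift Finder and Comparison Shopping snippets call `search_products`, which this page does not define. One way it could look, using only the standard library; the helper names, `store.example.com`, and `your-api-key` are placeholders and assumptions, not part of the API.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Placeholders: substitute your own store URL and API key.
BASE_URL = "https://store.example.com"
API_KEY = "your-api-key"

def build_request(**filters):
    """Build a GET request for /acp/products with the given query parameters."""
    url = f"{BASE_URL}/acp/products?{urlencode(filters)}"
    return urllib.request.Request(url, headers={"X-ACP-API-Key": API_KEY})

def search_products(**filters):
    """Send the search request and return the list under "items"."""
    with urllib.request.urlopen(build_request(**filters), timeout=10) as response:
        return json.load(response)["items"]
```

Keeping request construction separate from the network call makes the URL and headers easy to inspect without contacting the store.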

API Reference

Search Endpoint

GET /acp/products
GET /acp/products/search  (alias)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| q | string | Natural language search query |
| category | string | Filter by category |
| tag | string | Filter by tag |
| min_price | number | Minimum price |
| max_price | number | Maximum price |
| page | integer | Page number (default: 1) |
| page_size | integer | Results per page (default: 10, max: 100) |

Response

{
  "items": [
    {
      "id": "string",
      "title": "string",
      "description": "string",
      "price": 0.00,
      "similarity_score": 0.00,
      ...
    }
  ],
  "page": 1,
  "page_size": 10,
  "total": 0,
  "version": "supabase-semantic",
  "generated_at": "2024-01-15T10:30:00Z"
}
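
The `page`, `page_size`, and `total` fields are enough to walk through a full result set. A minimal sketch, assuming `total` counts matches across all pages (this page does not say so explicitly); `iter_all_items` and `fetch_page` are hypothetical names, with `fetch_page` standing for any function that returns one parsed response body.

```python
def iter_all_items(fetch_page, page_size=10):
    """Yield every item across pages, using "total" to know when to stop."""
    page = 1
    seen = 0
    while True:
        body = fetch_page(page=page, page_size=page_size)
        items = body["items"]
        yield from items
        seen += len(items)
        # Stop on an empty page as well, in case "total" is inaccurate.
        if not items or seen >= body["total"]:
            break
        page += 1
```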

Next Steps