OpenAI Embeddings API adds contrastive pre-training for text and code embeddings
AI Impact Summary
The update introduces contrastive pre-training to generate embeddings for both text and code, producing semantic representations that span the two modalities. This should improve retrieval quality and code search accuracy, enabling more effective similarity ranking and clustering for mixed text-and-code queries. Downstream pipelines may need re-indexing and a review of embedding model selection, since the new embedding space can shift results, dimensions, and costs. Validate with a pilot that benchmarks latency, token usage, and indexing throughput against the current embeddings.
Affected Systems
Business Impact
Applications relying on embeddings should see improved retrieval and code search quality, but they must re-index existing data and adapt their pipelines to the new embedding space, since vector dimensions and latency may differ from the current models.
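To make the "similarity ranking" step concrete, the sketch below ranks documents against a query by cosine similarity in a shared embedding space. The embed() function and its toy vectors are placeholders for illustration only; in a real pipeline those vectors would come from the embeddings endpoint.

```python
import math

# Hypothetical stand-in for an embeddings API call: returns fixed toy
# vectors so the ranking logic below is runnable. In production, these
# vectors would be fetched from the embeddings endpoint instead.
def embed(item: str) -> list[float]:
    corpus = {
        "def add(a, b): return a + b": [0.9, 0.1, 0.2],
        "sum two numbers": [0.8, 0.2, 0.3],
        "parse a JSON config file": [0.1, 0.9, 0.4],
    }
    return corpus[item]

def cosine(u: list[float], v: list[float]) -> float:
    # Cosine similarity: dot product over the product of vector norms.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def rank(query: str, documents: list[str]) -> list[str]:
    # Higher cosine similarity = semantically closer in the shared
    # text/code space, so a natural-language query can surface code.
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)

docs = ["def add(a, b): return a + b", "parse a JSON config file"]
print(rank("sum two numbers", docs))
# The code snippet ranks first: its vector is closest to the query's.
```

Note that re-indexing means recomputing every stored vector with the new model; vectors from different embedding spaces are not comparable, so mixing old and new embeddings in one index would corrupt rankings like the one above.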
- Date: not specified
- Change type: capability
- Severity: medium