OpenAI Embeddings API adds contrastive pre-training for text and code embeddings
AI Impact Summary
The update introduces contrastive pre-training to generate embeddings for both text and code, producing semantic representations that span the two modalities. This should improve retrieval quality and code search accuracy, enabling more effective similarity ranking and clustering for mixed text-and-code queries. Downstream pipelines may need re-indexing and a review of embedding model selection, since the new embedding space can shift results, dimensions, and costs. Validate with a pilot that benchmarks latency, token usage, and indexing throughput against the current embeddings.
Affected Systems
Business Impact
Applications relying on embeddings should see improved retrieval and code search quality, but they must re-index existing data and adapt their pipelines to the new embedding space, since vector dimensions and latency may differ from the current models.
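To make the "similarity ranking" step concrete, the sketch below ranks documents against a query by cosine similarity in a shared embedding space. The embed() function and its toy vectors are placeholders for illustration only; in a real pipeline those vectors would come from the embeddings endpoint.

```python
import math

# Hypothetical stand-in for an embeddings API call: returns fixed toy
# vectors so the ranking logic below is runnable. In production, these
# vectors would be fetched from the embeddings endpoint instead.
def embed(item: str) -> list[float]:
    corpus = {
        "def add(a, b): return a + b": [0.9, 0.1, 0.2],
        "sum two numbers": [0.8, 0.2, 0.3],
        "parse a JSON config file": [0.1, 0.9, 0.4],
    }
    return corpus[item]

def cosine(u: list[float], v: list[float]) -> float:
    # Cosine similarity: dot product over the product of vector norms.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def rank(query: str, documents: list[str]) -> list[str]:
    # Higher cosine similarity = semantically closer in the shared
    # text/code space, so a natural-language query can surface code.
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)

docs = ["def add(a, b): return a + b", "parse a JSON config file"]
print(rank("sum two numbers", docs))
# The code snippet ranks first: its vector is closest to the query's.
```

Note that re-indexing means recomputing every stored vector with the new model; vectors from different embedding spaces are not comparable, so mixing old and new embeddings in one index would corrupt rankings like the one above.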
- Date: not specified
- Change type: capability
- Severity: medium