Together AI
inference_host
155 signals tracked
Together AI Launches Kimi K2 (1T params)
Run Kimi K2 (1T params) on Together AI—frontier open model for agentic reasoning and coding, serverless deployment, 99.9% SLA, lower cost and instant scaling.
Date not specified
High Capability
Together AI launches FutureBench: AI agent benchmark for forecasting real-world events
FutureBench is a live, leak-free benchmark of true reasoning—AI agents forecast real-world events (rates, geopolitics) before they happen.
Date not specified
High Capability
Together AI delivers industry-leading inference speeds for DeepSeek-R1-0528 on NVIDIA Blackwell
Together AI inference is now among the world’s fastest, most capable platforms for running open-source reasoning models like DeepSeek-R1 at scale, thanks to our new inference engine designed for NVIDIA HGX B200.
Date not specified
High Capability
Together AI launches Qwen3-Coder: 480B agentic coding model now available
Unlock agentic coding with Qwen3-Coder on Together AI: 256K context, SWE-bench rivaling Claude Sonnet 4, zero-setup instant deployment.
Date not specified
High Capability
Together Evaluations: Benchmark LLMs with LLM Judges
Together Evaluations is a flexible framework for benchmarking LLMs using strong open-source models as judges. Skip manual labeling and rigid metrics—get fast, customizable insights into model quality for your specific tasks.
Date not specified
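The LLM-as-judge approach behind Together Evaluations reduces to a simple loop: a strong model scores each candidate answer against a rubric, replacing manual labeling. A minimal sketch of that loop — `call_judge` here is a hypothetical stub standing in for a real chat-completion call, not Together AI's actual API:

```python
# Minimal LLM-as-judge scoring loop (illustrative sketch).
# `call_judge` is a hypothetical stand-in for a chat-completion request
# to any judge model; it is stubbed here so the example is self-contained.

JUDGE_PROMPT = (
    "Rate the answer to the question on a 1-5 scale for correctness.\n"
    "Question: {question}\nAnswer: {answer}\n"
    "Reply with a single integer."
)

def call_judge(prompt: str) -> str:
    # Stub: a real implementation would send `prompt` to a judge model.
    return "4"

def score_responses(samples: list[dict]) -> float:
    """Average judge score over (question, answer) pairs."""
    scores = []
    for sample in samples:
        reply = call_judge(JUDGE_PROMPT.format(**sample))
        scores.append(int(reply.strip()))
    return sum(scores) / len(scores)

samples = [{"question": "2+2?", "answer": "4"}]
print(score_responses(samples))  # stub judge always returns 4 -> 4.0
```

Swapping the stub for a real model call and the rubric for a task-specific one is all that changes in production use.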
High Capability
VirtueGuard AI Security Now on Together AI
Date not specified
High Capability
OpenAI releases gpt-oss-120B and gpt-oss-20B on Together AI
Access OpenAI’s gpt-oss-120B on Together AI: Apache-2.0 open-weight model with serverless & dedicated endpoints, $0.50/1M in, $1.50/1M out, 99.9% SLA.
Date not specified
Critical Capability
OpenAI releases gpt-oss models: 20B and 120B compete with o4-mini
Date not specified
High Capability
Parsed achieves 60% better performance with open-source LLM for healthcare scribing
Parsed fine-tuned a 27B open-source model to beat Claude Sonnet 4 by 60% on a real-world healthcare task—while running 10–100x cheaper.
Date not specified
High Capability
Together AI Announces Fine-Tuning for OpenAI gpt-oss Models
Customize OpenAI’s gpt-oss-20B/120B with Together AI’s fine-tuning: train, optimize, and instantly deploy domain experts with enterprise reliability and cost efficiency.
Date not specified
High Capability
Together AI: Automating Engineering Workflows with AI Agents
Build AI agents for complex, long-running engineering tasks. Learn key patterns from a case study: accelerating LLM inference with speculative decoding.
Date not specified
Medium Capability
Together AI releases DeepSeek-V3.1: Hybrid Thinking Model Now Available
Access DeepSeek-V3.1 on Together AI: MIT-licensed hybrid model with thinking/non-thinking modes, 66% SWE-bench Verified, serverless deployment, 99.9% SLA.
Date not specified
High Capability
Together AI Launches Instant Clusters with NVIDIA GPUs — self-service GPU clusters available now
Together AI launches Instant Clusters: self-service GPU clusters with NVIDIA H100/B200, ready in minutes for training or inference at any scale.
Date not specified
High Capability
Together AI Fine-Tuning Platform Upgrades: Larger Models & Longer Contexts
Together AI expands Fine-Tuning Platform: train 100B+ models, extend context lengths, integrate with Hugging Face Hub, and access new DPO options.
Date not specified
High Capability
Together AI hires Mahadev Konar as SVP for Infrastructure Engineering
Hiring Mahadev Konar further deepens Together AI’s commitment to deliver the most reliable and scalable GPU infrastructure.
Date not specified
Info Capability
Batch Inference API: Rate Limit Increase & New Features
Our new Batch Inference API makes large-scale AI workloads simpler, faster, and cheaper. With a streamlined UI, universal model support, and 3000× higher rate limits—now up to 30B tokens—you can process massive datasets at half the cost of real-time APIs.
Date not specified
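Batch inference APIs of this kind typically take a file of requests in JSONL form, one self-contained request per line, each tagged with an ID so results can be matched back. A sketch of building such a file — the field names below follow a common OpenAI-style batch layout and are assumptions for illustration, not Together AI's documented schema:

```python
import json

def build_batch_lines(prompts: list[str], model: str = "example-model") -> str:
    """Serialize prompts as JSONL: one chat-completion request per line.
    Field names ("custom_id", "body", "messages") follow a common
    OpenAI-style batch layout and are assumptions, not Together AI's
    documented schema."""
    lines = []
    for i, prompt in enumerate(prompts):
        request = {
            "custom_id": f"req-{i}",  # lets results be matched to inputs
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        lines.append(json.dumps(request))
    return "\n".join(lines)

jsonl = build_batch_lines(["Hello", "Summarize X"])
print(jsonl.count("\n") + 1)  # 2 requests -> 2 lines
```

The resulting string would be written to a file and uploaded in one shot, which is what lets a batch endpoint price and schedule the work more cheaply than per-request real-time calls.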
High Capability
Together AI Launches Adaptive-Learning Speculator System (ATLAS) for Faster LLM Inference
LLM inference that gets faster as you use it. Our runtime-learning accelerator adapts continuously to your workload, delivering 500 TPS on DeepSeek-V3.1, a 4x speedup over baseline performance without manual tuning.
Date not specified
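A speedup claim like the 4x above can be sanity-checked with the standard speculative-decoding expectation: if the speculator's per-token acceptance rate is a and it drafts k tokens, one target-model verification yields on average (1 - a^(k+1)) / (1 - a) tokens. A quick check — the acceptance rate used is illustrative, not a published ATLAS figure:

```python
def expected_tokens_per_step(a: float, k: int) -> float:
    """Expected tokens gained per target-model verification when the
    speculator's per-token acceptance rate is `a` and it drafts `k`
    tokens (geometric-series sum 1 + a + a^2 + ... + a^k)."""
    if a == 1.0:
        return k + 1.0  # every drafted token accepted, plus the verify token
    return (1 - a ** (k + 1)) / (1 - a)

# Illustrative numbers only: at a = 0.8 with k = 7 drafted tokens, each
# verify step yields ~4.16 tokens, i.e. roughly a 4x speedup when the
# target model's verification pass dominates the cost.
print(round(expected_tokens_per_step(0.8, 7), 2))  # -> 4.16
```

This is also why a runtime-learning speculator helps: raising a on the user's actual workload increases the accepted-prefix length without any change to the target model.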
High Capability
Together AI Launches Startup Accelerator for AI Native Apps
We've launched the Together AI Startup Accelerator: Up to $50K credits, expert engineering hours, GTM support, community and VC access for AI-native apps in build–scale tiers.
Date not specified
Info Capability
Together AI expands model library with 40+ image and video models
Together AI adds 40+ image & video models, including Sora 2 and Veo 3, to build end-to-end multimodal apps with unified OpenAI-compatible APIs and transparent pricing.
Date not specified
High Capability
Large Reasoning Models Fail to Follow Instructions — New Benchmark Reveals Critical Flaw
ReasonIF finds frontier LRMs fail to follow reasoning instructions >75% of the time; introduces a benchmark across languages, formatting, and length.
Date not specified
Critical Capability