SANTA CLARA, Calif., March 21, 2023 (GLOBE NEWSWIRE) -- GTC -- NVIDIA today launched four inference platforms optimized for a diverse set of rapidly emerging generative AI applications — helping ...
Using these new TensorRT-LLM optimizations, NVIDIA achieved a 2.4x performance gain on its current H100 AI GPU between MLPerf Inference 3.1 and 4.0 in the GPT-J test's offline scenario.
Nvidia has set new MLPerf performance records with its H200 Tensor Core GPU and TensorRT-LLM software. MLPerf Inference is a benchmarking suite that measures inference performance across ...
At its GTC conference, Nvidia today announced Nvidia NIM, a new software platform designed to streamline the deployment of custom and pre-trained AI models into production environments. NIM takes the ...
Flaws replicated from Meta’s Llama Stack to Nvidia TensorRT-LLM, vLLM, SGLang, and others expose enterprise AI stacks to systemic risk. Cybersecurity researchers have uncovered a chain of critical ...
Apple and NVIDIA shared details of a collaboration to improve the performance of LLMs with a new text generation technique for AI. Cupertino writes: Accelerating LLM inference is an important ML ...
Nvidia Corp. said today it’s adding a new, microservices-based layer to its popular Nvidia AI Enterprise platform, giving generative artificial intelligence model developers, platform providers and ...
Nvidia Corp. is doubling down on its partnership with Amazon Web Services Inc. to expand what’s possible in the realms of artificial intelligence, robotics and quantum computing development. The two ...
Nvidia has released analysis showing a 4X to 10X reduction in cost per token for AI inferencing by switching to open source models. The cost savings required combining Blackwell hardware with two ...
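For context on the 4X to 10X figure above, cost per token is typically derived from GPU cost per unit time divided by generation throughput. A minimal sketch of that arithmetic follows; all dollar figures and throughput numbers are hypothetical placeholders, not values from NVIDIA's analysis:

```python
def cost_per_million_tokens(gpu_hourly_cost_usd: float, tokens_per_second: float) -> float:
    """Cost to generate one million tokens on a single GPU.

    cost/token = (GPU cost per second) / (tokens generated per second).
    """
    cost_per_second = gpu_hourly_cost_usd / 3600.0
    return cost_per_second / tokens_per_second * 1_000_000

# Hypothetical numbers: same $4/hr GPU, 4x higher throughput after optimization.
baseline = cost_per_million_tokens(gpu_hourly_cost_usd=4.0, tokens_per_second=500)
optimized = cost_per_million_tokens(gpu_hourly_cost_usd=4.0, tokens_per_second=2000)

print(f"baseline:  ${baseline:.2f} per 1M tokens")   # $2.22
print(f"optimized: ${optimized:.2f} per 1M tokens")  # $0.56
print(f"cost reduction: {baseline / optimized:.1f}x")  # 4.0x
```

At fixed hardware cost, the cost-per-token reduction equals the throughput gain, which is why software stacks like TensorRT-LLM translate directly into the kind of savings the analysis describes.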