Reduce LLM Hallucinations by 80%

Cut hallucinations for RAG workflows with just 1 line of code using our semantic filter.

What was our total amount of lease payments as of the end of 2023?

Before

To calculate the total amount of lease payments as of the end of 2023, we need to consider the maturities of all lease liabilities, as there is no exact mention of the figure in the sources provided.

After

$313 million was the total amount of lease payments at the end of 2023

A 3-step agent with a per-step hallucination rate of 15% vs. 3% has an end-to-end effectiveness of 61% vs. 91%.
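The arithmetic behind those figures can be checked in a few lines. This is a minimal sketch that assumes every step fails independently at the stated hallucination rate and that the agent only succeeds when all steps do; the function name is illustrative.

```python
def agent_success_rate(hallucination_rate: float, steps: int) -> float:
    """Probability that every step of a multi-step agent is hallucination-free.

    Assumes each step fails independently with the same probability.
    """
    return (1.0 - hallucination_rate) ** steps


for rate in (0.15, 0.03):
    print(f"{rate:.0%} per step -> {agent_success_rate(rate, 3):.0%} over 3 steps")
# 15% per step -> 61% over 3 steps
# 3% per step -> 91% over 3 steps
```

Because the per-step success rate is raised to the number of steps, small per-step improvements compound quickly as agents get longer.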

Take Vector Search to the Next Level

Pongo combines multiple state-of-the-art semantic similarity models with our own proprietary ranking algorithm to ensure you always get the right info.

Benchmarks

Try it out

Works with your Existing Pipeline

Pongo sits right on top of your existing pipeline, whether you use a vector database or Elasticsearch. Just send us your top 100-200 search results and we'll return the relevant ones.
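To show where such a filter slots in, here is a rough sketch of the pipeline shape. It does not use Pongo's actual API (see the docs for that): the `Filter` interface, `word_overlap_filter`, and `answer_context` are hypothetical names, and the word-overlap scoring is only a toy stand-in for a real semantic filter.

```python
from typing import Callable, List, Sequence

# A filter takes the query and the retriever's candidates and returns the
# passages it considers relevant, best first. Hypothetical interface.
Filter = Callable[[str, Sequence[str]], List[str]]


def word_overlap_filter(query: str, candidates: Sequence[str], keep: int = 3) -> List[str]:
    """Toy stand-in for a hosted semantic filter: rank by shared lowercase words."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(c.lower().split())), c) for c in candidates]
    # The sort is stable, so ties keep the retriever's order; zero-overlap passages are dropped.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [text for _, text in ranked[:keep]]


def answer_context(query: str, retrieved: Sequence[str], rerank: Filter) -> List[str]:
    """Pass the retriever's top results through the filter before prompting the LLM."""
    return rerank(query, retrieved)


# In practice `retrieved` would be the top 100-200 hits from your vector database or Elasticsearch.
retrieved = [
    "Office relocation schedule for 2023",
    "Total lease payments at the end of 2023",
    "Employee headcount by region",
]
context = answer_context("total lease payments 2023", retrieved, word_overlap_filter)
print(context)
# ['Total lease payments at the end of 2023', 'Office relocation schedule for 2023']
```

The retriever and the LLM call stay untouched; the only change to the pipeline is the single call that narrows the candidates.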

Read The Docs

Production Ready

Lightning Fast

Our distributed architecture ensures consistent latency whether you run 100 or 1,000,000 requests a day.

Zero Data Retention

Pongo only operates at runtime. No data from your queries is stored, and no data leaves our AWS VPC.

Pricing

Develop

Free

500 free queries / mo
We'll work with you to integrate Pongo

Deploy

$60 / mo

60K queries / mo
Standard compute
$8 per additional 10K queries

Lightning

$250 / mo

350K queries / mo
60% faster compute
$12 per additional 10K queries

Enterprise

Custom

Optional BYOC Deployment
Custom Models
99.99% Uptime SLA
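For a rough comparison of the two paid tiers at a given volume, the sketch below applies the listed prices. It assumes overage is billed per started block of 10K queries, which the pricing list does not specify; `TIERS` and `monthly_cost` are illustrative names.

```python
import math

# Figures from the pricing list: (base USD, included queries, USD per extra 10K queries).
TIERS = {
    "Deploy": (60, 60_000, 8),
    "Lightning": (250, 350_000, 12),
}


def monthly_cost(tier: str, queries: int) -> int:
    """Estimated monthly bill in USD, assuming overage is billed per started 10K block."""
    base, included, per_block = TIERS[tier]
    extra_blocks = math.ceil(max(0, queries - included) / 10_000)
    return base + extra_blocks * per_block


for volume in (50_000, 200_000, 300_000):
    print(volume, monthly_cost("Deploy", volume), monthly_cost("Lightning", volume))
# 50000 60 250
# 200000 172 250
# 300000 252 250
```

Under that assumption, Deploy remains the cheaper tier until volume approaches 300K queries a month, at which point Lightning costs about the same and adds the faster compute.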

FAQ

How does Pongo work?

We use a collection of different types of models and retrieval methods in conjunction with one another, combining results together to come up with a final score for each document.

This avoids the hallucinations and shortcomings of any single retrieval method and returns the most relevant results every time.
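Pongo's ranking algorithm is proprietary, so the following is not it. As a general illustration of combining several retrieval methods into one final score per document, this sketch uses reciprocal rank fusion, a widely used public technique; the document IDs and retriever names are made up.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k = 60 is a commonly used default.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)


# Hypothetical outputs from three different retrievers for the same query.
dense = ["doc_a", "doc_b", "doc_c"]
keyword = ["doc_b", "doc_d", "doc_a"]
cross_encoder = ["doc_b", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([dense, keyword, cross_encoder])
print(fused)
# ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

A document that several methods rank highly rises to the top, while one that only a single method favors (here `doc_c`) falls to the bottom, which is the intuition behind not relying on any one retriever.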

Can I self-host Pongo?

Yes, Pongo can be deployed in a VPC. Just book a call with us, and we'll find the best option for you.

What is Pongo's latency?

Latency on the Deploy tier is 600-650 ms for 100 documents of 512 tokens each, vs. 350-400 ms on the Lightning tier. By default, requests are routed to US-West-2 in Oregon. Please contact us if you need a deployment in another region.

Is Pongo secure?

Yes, Pongo only operates at runtime. We store no data, and no data leaves our VPC in AWS. We are in the process of getting SOC 2 compliance.