After benchmarking the R1 1776 model and seeing how post-training influenced its performance (full post here), I realized another gap.
Models that can technically handle a huge context window often degrade long before that window is full.
Purpose
This benchmark measures the real-world inference performance of Perplexity AI’s R1 1776 model, a post-trained version of DeepSeek R1 671B designed to eliminate censorship and deliver unbiased information, under controlled conditions.
People have been having conversations for thousands of years. We’re wired for it. But we’re not wired for talking to something that doesn’t understand social cues: the subtle, unspoken signals that shape every human exchange.
For years, I’ve been fascinated by AI assistants. They are useful, sure, but they always seem to be missing something. We all want a JARVIS from Iron Man or the computer from Star Trek.