Why vanity metrics are actively misleading your GEO program
Here is the pattern I see constantly. A team starts tracking AI mentions. The number goes up. They report it as progress. Leadership nods. Then nothing changes in pipeline quality, nobody can explain why, and the GEO program gets quietly deprioritized six months later.
Raw mention growth can look positive while commercial outcomes worsen. Mention volume does not tell you whether the model recommends you for the right use cases, positions you favorably against competitors, or describes you accurately on the claims that matter for buyer trust. A brand that is mentioned in 60% of prompts but framed as "a budget alternative with limited enterprise features" in all of them is not winning. It is losing with high visibility.
A decision-ready scorecard must include presence quality (not just presence), framing accuracy (are the descriptions correct and favorable?), factual reliability (how often does the model get your facts wrong?), and risk concentration by intent (where are the errors happening in the buying journey?). The goal is to measure influence on buying behavior, not content activity.
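To make the distinction concrete, here is a minimal sketch of what such a scorecard might look like computed over a sample of evaluated prompt/response pairs. Everything here is illustrative: the `PromptResult` schema, the framing labels, and the intent-stage names are assumptions, not a standard, and in practice the labels would come from human or model-assisted review of each response.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical record for one sampled prompt/response pair.
# Field names and label values are illustrative, not a standard schema.
@dataclass
class PromptResult:
    intent: str           # buying-journey stage, e.g. "evaluation"
    mentioned: bool       # did the model mention the brand at all?
    framing: str          # "favorable", "neutral", or "unfavorable"
    factual_errors: int   # count of wrong claims about the brand

def scorecard(results: list[PromptResult]) -> dict:
    mentions = [r for r in results if r.mentioned]
    favorable = [r for r in mentions if r.framing == "favorable"]
    errors_by_intent: Counter = Counter()
    for r in mentions:
        if r.factual_errors:
            errors_by_intent[r.intent] += r.factual_errors
    return {
        # Presence: how often the brand appears at all.
        "mention_rate": len(mentions) / len(results),
        # Presence quality: how often the framing is actually favorable.
        "favorable_rate": len(favorable) / len(mentions) if mentions else 0.0,
        # Factual reliability: share of mentions with any wrong claim.
        "error_rate": (sum(r.factual_errors > 0 for r in mentions)
                       / len(mentions)) if mentions else 0.0,
        # Risk concentration: where in the journey errors cluster.
        "errors_by_intent": dict(errors_by_intent),
    }

# Example: a high mention rate that hides unfavorable framing and errors.
sample = [
    PromptResult("awareness", True, "favorable", 0),
    PromptResult("evaluation", True, "unfavorable", 2),
    PromptResult("evaluation", True, "unfavorable", 1),
    PromptResult("purchase", False, "neutral", 0),
    PromptResult("purchase", True, "neutral", 1),
]
card = scorecard(sample)
print(card["mention_rate"])      # 0.8 -- looks great on its own
print(card["favorable_rate"])    # 0.25 -- the real story
print(card["errors_by_intent"])  # {'evaluation': 3, 'purchase': 1}
```

The point of the sample data is exactly the failure mode described above: an 80% mention rate reported alone reads as progress, while the favorable-framing rate and the error concentration at the evaluation stage show the program is losing where it matters.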