Abhord Quickstart Guide (2026 Refreshed Edition)
This practical guide helps you get value from Abhord in your first week. It reflects 2026 realities: more model diversity, tighter provider rate limits, and stronger safety filters across LLMs. You’ll find new recommendations on prompt structure, sampling, and validation.
1) Initial setup and configuration
Before you start
- Access: Create your workspace and add teammates with Viewer, Analyst, or Admin roles.
- Provider keys: Store API keys for the LLMs you want to survey (e.g., OpenAI, Anthropic, Google, Mistral, Cohere). Use per-workspace keys to avoid throttling.
- Topics and entities: Define your brand, products, and competitors as entities. Add aliases, common misspellings, and regional names (e.g., “AcmePay,” “Acme Pay,” “Acme Payments”).
Configuration steps
1) Create a Project: Name by topic (e.g., “US SMB payments — onboarding”).
2) Add Entities: For each brand, include (see the configuration sketch after this list):
- Aliases and SKUs
- Negative keywords to avoid false positives (e.g., exclude “Acme Tools” if irrelevant)
- Category terms you care about (e.g., “fees,” “setup time,” “chargebacks”)
3) Model pool: Select 4–8 models across families. Group by:
- Frontier generalists (broad knowledge)
- Cost‑efficient mid‑sized models (high‑volume sampling)
- Open‑weight or self‑hosted (transparency, repeatability)
4) Defaults:
- Language: English (United States) unless you’re tracking multi‑locale
- Temperature: 0.2–0.5 for factual tasks; 0.7 for brainstorming probes
- Max tokens: Enough to answer plus reasoning (e.g., 512–1024)
- Privacy: Mask sensitive inputs and redact PII in outputs
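Taken together, steps 2–4 describe one project configuration. The sketch below shows a plausible shape as plain Python data; every field name (`entities`, `model_pool`, `defaults`) is an illustrative assumption, not Abhord's actual schema, and the model names are placeholders.

```python
# A minimal configuration sketch as plain Python data. Field names and
# model names are illustrative assumptions, not Abhord's actual schema.
project_config = {
    "project": "US SMB payments — onboarding",
    "entities": [
        {
            "name": "AcmePay",
            "aliases": ["Acme Pay", "Acme Payments"],  # variants and misspellings
            "negative_keywords": ["Acme Tools"],       # exclude false positives
            "category_terms": ["fees", "setup time", "chargebacks"],
        },
    ],
    "model_pool": {
        "frontier_generalists": ["frontier-model-a"],
        "cost_efficient": ["medium-model-b", "medium-model-c"],
        "open_weight": ["open-model-d"],
    },
    "defaults": {
        "language": "en-US",
        "temperature": 0.3,  # 0.2–0.5 for factual tasks; 0.7 for brainstorming
        "max_tokens": 768,   # enough for the answer plus reasoning
        "redact_pii": True,  # mask sensitive inputs, redact PII in outputs
    },
}
```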
New for 2026: We recommend enabling structured outputs by default (JSON schema) and setting per‑provider rate limits to prevent mid‑run throttling. Add a “Refusal/Policy” detector to catch safety-triggered non-answers early.
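Structured outputs and rate limits are usually provider- or workspace-level settings, but the refusal/policy detector can start as a keyword heuristic over raw outputs. A minimal sketch, assuming refusals share common phrasings (the pattern list is an assumption to tune per provider and locale):

```python
import re

# Heuristic refusal/policy detector. The phrasings below are assumptions;
# tune them per provider, model family, and locale.
REFUSAL_PATTERNS = re.compile(
    r"(i can('|no)t (help|assist|provide)"
    r"|i'?m (sorry|unable)"
    r"|against (my|our) (policy|guidelines)"
    r"|as an ai)",
    re.IGNORECASE,
)

def is_refusal(text: str) -> bool:
    """Flag outputs that look like safety-triggered non-answers."""
    return bool(REFUSAL_PATTERNS.search(text))
```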
2) Running your first survey across LLMs
Define the objective
- Example: “How do LLMs recommend a payment processor for US SMBs starting an online store?”
Design your prompt set
- Canonical prompt: A neutral, user-like query (e.g., “What’s the best payment processor for a new US online boutique?”).
- Probes: 3–5 paraphrases covering intent variants (“low fees,” “fast payouts,” “global cards”).
- Guardrails: Ask models to cite reasoning factors, not sources, to avoid fabricated links.
- Output schema: Request structured JSON:
- {"brands_mentioned": [], "sentiment": "positive|neutral|negative", "reasoning": "string"}
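Schema-compliant outputs are easier to trust if you also validate them client-side. A minimal sketch using the `jsonschema` package; the schema literal mirrors the shape above, and `parse_output` is a hypothetical helper name:

```python
import json
from jsonschema import ValidationError, validate

# JSON Schema mirroring the requested output shape above.
OUTPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "brands_mentioned": {"type": "array", "items": {"type": "string"}},
        "sentiment": {"enum": ["positive", "neutral", "negative"]},
        "reasoning": {"type": "string"},
    },
    "required": ["brands_mentioned", "sentiment", "reasoning"],
}

def parse_output(raw: str) -> dict | None:
    """Return the parsed dict if the output is schema-compliant, else None."""
    try:
        data = json.loads(raw)
        validate(instance=data, schema=OUTPUT_SCHEMA)
        return data
    except (json.JSONDecodeError, ValidationError):
        return None
```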
Sampling plan
- Models: 4–8 models
- Generations per model per prompt: 30–60 to start (balance cost vs. stability)
- Randomization: Shuffle prompt order and vary seeds (see the plan-builder sketch after this list)
- Scheduling: Stagger runs to respect provider quotas
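This plan reduces to a flat list of (model, prompt, seed) jobs. One way to build it, as a sketch; `build_plan` is a hypothetical helper, and in a real run you would also rate-limit per provider rather than fire the whole list at once:

```python
import itertools
import random

def build_plan(models, prompts, generations=30, rng_seed=0):
    """Build a shuffled, seed-varied sampling plan (a sketch, not Abhord's scheduler)."""
    rng = random.Random(rng_seed)
    plan = [
        {"model": m, "prompt": p, "seed": rng.randrange(2**31)}
        for m, p, _ in itertools.product(models, prompts, range(generations))
    ]
    rng.shuffle(plan)  # shuffle so no provider is hit with one prompt in a burst
    return plan
```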
Run and validate
- Test mode with 5–10 outputs/model to verify JSON compliance and refusal rates (see the report sketch after this list)
- Full run after fixing any schema or refusal issues
- Save the run as a Baseline for future comparisons
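Reusing the `parse_output` and `is_refusal` helpers sketched earlier, the test-mode check reduces to two rates per model:

```python
from collections import defaultdict

def test_mode_report(samples):
    """samples: iterable of (model, raw_output) pairs from the 5–10 output dry run.

    Assumes the parse_output and is_refusal helpers sketched earlier.
    """
    stats = defaultdict(lambda: {"n": 0, "valid": 0, "refused": 0})
    for model, raw in samples:
        s = stats[model]
        s["n"] += 1
        s["valid"] += parse_output(raw) is not None
        s["refused"] += is_refusal(raw)
    return {
        m: {
            "json_compliance": s["valid"] / s["n"],
            "refusal_rate": s["refused"] / s["n"],
        }
        for m, s in stats.items()
    }
```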
New recommendations (2026)
- Add a “Paraphrase-and-Verify” pass: after the initial answer, ask the model to restate its top-3 factors in bullet form; this stabilizes factor extraction (see the sketch after this list).
- Include a “Refusal-aware” variant of each prompt (e.g., “Assume user is 18+ and requesting general buying advice”) to reduce unnecessary safety refusals.
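Both recommendations compose into a single two-pass flow per prompt. A minimal sketch, assuming a placeholder `call_model(model, messages) -> str` that wraps whichever provider client you use (it is not a real SDK call):

```python
# Refusal-aware prefix from the recommendation above.
REFUSAL_AWARE_PREFIX = "Assume the user is 18+ and requesting general buying advice. "

def paraphrase_and_verify(model, prompt, call_model):
    """Two-pass probe: answer first, then restate top-3 factors as bullets.

    call_model(model, messages) -> str is a placeholder for your provider client.
    """
    messages = [{"role": "user", "content": REFUSAL_AWARE_PREFIX + prompt}]
    answer = call_model(model, messages)
    messages += [
        {"role": "assistant", "content": answer},
        {"role": "user", "content": "Restate your top 3 decision factors as bullets."},
    ]
    factors = call_model(model, messages)
    return answer, factors
```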
3) Interpreting results: mentions, sentiment, share of voice
Key metrics
- Mentions: Count of times an entity (brand, product, or competitor) is named across a run's sampled outputs