The model is one component. The infrastructure is what compounds. octa squared combines six algorithms running in parallel: online experimentation, ensemble signal ranking, multi-model rotation, sequence policy learning, model distillation, and the macro analysis loop.
Most AI-for-sales pitches imply the model keeps learning in real time. It does not. Model weights change when training runs land. What changes continuously is the orchestration around the model.
octa retrains on a release cadence: continued pre-training plus long-horizon RL on the corpus. Weights change weekly to monthly, not per campaign.
Around the model, six algorithms run continuously: online experimentation, ensemble signal ranking, multi-model rotation, sequence policy learning, model distillation, and the macro analysis loop.
Every send, reply, meeting, landing page click, and deal stage transition becomes validated learning per segment.
Validated learnings flow back into the next octa pretraining and fine-tune pass. The process feeds the model. The model feeds the process.
The systems that defined search, research, and game-playing were infrastructures combining many loops, with outcomes feeding back into the inputs.
Hundreds of weak signals are ranked continuously. No single signal decides. The ranking infrastructure, not any single signal, is the moat.
Search proposes moves, self-play generates new training positions, and the network learns from outcomes. The combination is what learns.
Variants compete for traffic. Outcomes are measured. Winners get more allocation per segment. The platform itself is the learner.
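The allocation loop described above (variants compete, outcomes are measured, winners get more traffic per segment) is commonly implemented as a multi-armed bandit. The sketch below uses Beta-Bernoulli Thompson sampling as one plausible approach; it is an assumption for illustration, not a description of the production allocator. The class name `SegmentBandit`, the segment and variant labels, and the binary success outcome are all hypothetical.

```python
import random
from collections import defaultdict


class SegmentBandit:
    """Beta-Bernoulli Thompson sampling, tracked separately per segment.

    Hypothetical sketch: names and the binary outcome are assumptions.
    """

    def __init__(self, variants, seed=None):
        self.variants = list(variants)
        self.rng = random.Random(seed)
        # (segment, variant) -> [alpha, beta], starting from a Beta(1, 1) prior
        self.params = defaultdict(lambda: [1, 1])

    def choose(self, segment):
        """Sample a plausible success rate per variant and pick the highest."""
        draws = {
            v: self.rng.betavariate(*self.params[(segment, v)])
            for v in self.variants
        }
        return max(draws, key=draws.get)

    def record(self, segment, variant, success):
        """Update the posterior for one observed outcome."""
        self.params[(segment, variant)][0 if success else 1] += 1

    def allocation(self, segment, samples=2000):
        """Estimate the share of traffic each variant would currently receive."""
        counts = dict.fromkeys(self.variants, 0)
        for _ in range(samples):
            counts[self.choose(segment)] += 1
        return {v: c / samples for v, c in counts.items()}
```

Calling `choose(segment)` before each send and `record(segment, variant, success)` after each outcome shifts traffic toward the variants that are winning in that segment, while segments with little data stay close to an even split.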
Six GTM algorithms run around the octa model. Each loop validates inputs for the others. Validated learnings flow back into the corpus.
The combination is the point: online experimentation tunes variables, ensemble ranking decides priority, multi-model rotation picks the output, sequence policy decides the next step, distillation lowers cost, and macro analysis watches the system.
graph8 runs continuous controlled experiments on campaign variables: subject lines, send times, cadence depth, channel mix, landing page layout, and voice openers.
Accounts, contacts, intent signals, and sequence variants are scored by combining many weak signals into one ranked output. Outcomes re-weight the signals.
Generator, critic, and editor roles rotate across models per task. The winning output ships. The competition log feeds training.
Each touch is a state. From each state there are many possible next actions. octa squared updates the state-to-action policy from real outcomes.
Frontier models generate canonical exemplars for hard GTM tasks. A retrieval pool serves those exemplars to cheaper open-source student models.
Weekly and monthly reports span capacity, customer outcomes, algorithm performance, and segment shifts so humans can review and course-correct.
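As an illustration of the ensemble ranking idea above (many weak signals combined into one ranked output, with outcomes re-weighting the signals), here is a minimal sketch using online logistic regression. This is one possible technique chosen for brevity, not a claim about how the ranking is actually built. `SignalRanker`, the signal names, and the assumption that signal values are scaled to roughly 0 to 1 are all hypothetical.

```python
import math


class SignalRanker:
    """Combine weak signals into one score and re-weight them from outcomes.

    Hypothetical sketch: online logistic regression over named signals.
    """

    def __init__(self, signal_names, learning_rate=0.1):
        self.weights = {name: 0.0 for name in signal_names}
        self.bias = 0.0
        self.learning_rate = learning_rate

    def score(self, signals):
        """Return a score in (0, 1) from a dict of signal name -> value."""
        z = self.bias + sum(
            self.weights[name] * signals.get(name, 0.0) for name in self.weights
        )
        return 1.0 / (1.0 + math.exp(-z))

    def rank(self, candidates):
        """Sort candidate ids by score, highest first."""
        return sorted(
            candidates, key=lambda cid: self.score(candidates[cid]), reverse=True
        )

    def update(self, signals, outcome):
        """One gradient step on log loss; outcome is 1 (converted) or 0."""
        error = outcome - self.score(signals)
        self.bias += self.learning_rate * error
        for name in self.weights:
            self.weights[name] += self.learning_rate * error * signals.get(name, 0.0)
```

No single signal decides the order here: each contributes in proportion to a weight that moves every time an outcome is recorded through `update`.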
Six loops, but the loops are connected. Validated learnings from one loop become the starting material for another.
The day-to-day loop is the infrastructure. Validated learnings, winning exemplars, surviving sequence policies, and re-tuned rankings flow back into the corpus that retrains octa.
The model release cadence is weekly to monthly. The infrastructure cadence is hourly. The two cadences feed each other.
Flowing back into the corpus: winning exemplars, variable locks, sequence policies, and the distilled pool.
The five steps that turn yesterday's campaigns into next week's model.
Every campaign outcome, winning variant, state-to-action transition, and distilled exemplar lands in the corpus.
The infrastructure tags segment, variable, channel, and intent without humans in the hot path. Humans review aggregates.
Held-out replay on octa Bench. If the new exemplar would have won historical campaigns, it survives.
Surviving exemplars enter the next octa continued pre-training and long-horizon RL pass.
New octa weights deploy. The six algorithms now run with a sharper component. The next loop starts.
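The validation step in the sequence above (an exemplar survives if it would have won historical campaigns) can be sketched as a simple gate. The internals of octa Bench are not described here, so everything below is an assumption: the `HistoricalCampaign` record, the `estimate_reply_rate` callable that stands in for whatever replay estimator is used, and the `min_win_share` threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class HistoricalCampaign:
    """A held-out campaign (hypothetical record shape)."""
    segment: str
    winner_reply_rate: float  # observed rate of the variant that actually won


def survives_replay(
    exemplar: str,
    held_out: Sequence[HistoricalCampaign],
    estimate_reply_rate: Callable[[str, HistoricalCampaign], float],
    min_win_share: float = 0.5,
) -> bool:
    """Keep an exemplar only if it is estimated to beat the historical winner
    on more than `min_win_share` of the held-out campaigns."""
    if not held_out:
        return False  # nothing to replay against, so nothing is validated
    wins = sum(
        estimate_reply_rate(exemplar, campaign) > campaign.winner_reply_rate
        for campaign in held_out
    )
    return wins / len(held_out) > min_win_share
```

Only exemplars for which `survives_replay` returns `True` would move on to the training pass; the threshold controls how strict the gate is.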
Live across every graph8 customer org. Every loop runs on its own cadence. The combination is what compounds.
6 algorithms: running in production today across every graph8 customer org.
Hourly: the cadence where experiment results, rankings, and routing decisions update.
Weekly: the cadence where macro analysis re-tunes the infrastructure.
Execution credits send, call, and run agents. Contact credits reveal and enrich the people you target. Three ways in.
A free entry point to the full platform. No credit card.
Unlimited contact data, in the tools you already use.
Unlimited users and data. Every app, one bill.
The $99 Team plan is the full platform: everything below, for the whole org. MCP Unlimited and Pay as you go are lighter entry points with usage and seat limits.
See which variables lock, which model wins each task, which signal gets ranked up, which sequence policy fires, and what the next loop ships.