NOSIBLE World

About

NOSIBLE World started from a single observation: the news industry solved the classification problem decades ago with a handful of flat topic taxonomies. It never went back to ask whether those taxonomies were actually good enough for the analysts, quants, researchers, and journalists who rely on them every day. They weren't. They still aren't.

The thesis

Every news event has multiple simultaneous coordinates. The same earnings call is a Technology sector story (GICS), a corporate action (NOSIBLE Corporate Events), an Analysis piece (IPTC Genre), an economic-frame article (Media Frames), and, depending on the quarter, Joy or Surprise in tone (Ekman). A single taxonomy captures exactly one of those dimensions and ignores the rest.

Legacy news taxonomies like IPTC Media Topics were designed for desks, not for analysis. They solve the routing problem (get this article to the right editor) without solving the research problem (find every event with a particular combination of sector, geography, political action type, and emotional register across the last three years). NOSIBLE World was built to solve the research problem.

The platform classifies every event across 13 complementary ontologies simultaneously: topical, sectoral, political, cyber, health, sports, cultural, framing, emotional, brand-safety, and more. The full list and the reasoning behind it are in the Methodology. Each ontology answers a different question. Together they let analysts pivot between coordinates, from country to sector to political action type, and surface events that no single-axis system would show.
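
To make the pivot concrete, here is a minimal sketch of a multi-axis filter. It assumes each event record carries one label per ontology. The record layout, the field names (`country`, `labels`, `gics`, `iptc_genre`, `ekman`), and the `select` helper are hypothetical illustrations, not the published data model.

```python
# Hypothetical event records: one top-ranked label per ontology.
# Field names and label strings are illustrative only.
EVENTS = [
    {"title": "Chipmaker beats Q3 estimates", "country": "US",
     "labels": {"gics": "Semiconductors", "iptc_genre": "Analysis", "ekman": "Joy"}},
    {"title": "Chipmaker warns on export limits", "country": "US",
     "labels": {"gics": "Semiconductors", "iptc_genre": "Analysis", "ekman": "Fear"}},
    {"title": "Regional bank posts surprise loss", "country": "DE",
     "labels": {"gics": "Regional Banks", "iptc_genre": "Analysis", "ekman": "Surprise"}},
]


def select(events, country=None, **label_filters):
    """Return the events that match every requested coordinate."""
    matches = []
    for event in events:
        if country is not None and event["country"] != country:
            continue
        # Every ontology filter must agree with the event's label.
        if all(event["labels"].get(axis) == value
               for axis, value in label_filters.items()):
            matches.append(event)
    return matches


# Pivot from country, to sector, to emotional register.
us_chips = select(EVENTS, country="US", gics="Semiconductors")
us_chips_fear = select(EVENTS, country="US", gics="Semiconductors", ekman="Fear")
```

A single-axis system can answer only the first of these questions. Each added keyword narrows the result along another ontology.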

The classifier is not a monolithic LLM doing everything. It is a multilingual OpenAI embedding classifier: one text-embedding-3-large vector is computed over the event title and description, then scored by cosine similarity against the taxonomy caches. The top-ranked label in each taxonomy ships. There is no LLM in the classification path.
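
The scoring step can be sketched in a few lines. This is an illustration under assumptions, not the production code: it assumes each taxonomy cache maps a label to one precomputed vector, and it uses toy 3-dimensional vectors in place of real embeddings. The names `cosine`, `classify`, and `CACHES` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(event_vec, taxonomy_caches):
    """Return {taxonomy: (label, score)}, keeping the top label per taxonomy."""
    result = {}
    for taxonomy, label_vecs in taxonomy_caches.items():
        scored = ((label, cosine(event_vec, vec))
                  for label, vec in label_vecs.items())
        result[taxonomy] = max(scored, key=lambda pair: pair[1])
    return result


# Assumed cache layout: taxonomy -> label -> vector (toy values).
CACHES = {
    "ekman": {"Joy": [0.9, 0.1, 0.0], "Fear": [0.0, 0.2, 0.9]},
    "iptc_genre": {"Analysis": [0.6, 0.8, 0.0], "Obituary": [0.0, 0.1, 0.9]},
}

event_vec = [0.8, 0.5, 0.1]  # stands in for the title + description embedding
top = classify(event_vec, CACHES)
```

The same event vector is reused for every taxonomy, so adding an ontology in this sketch means adding a cache, not another model call.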

The 13 ontologies (v1)

A quick map of what each classification layer answers:

Topical spine: IPTC Media Topics (~1,200 codes)
Document type: IPTC Genre (~30 newsroom types)
Sectors: GICS Sub-Industry (158 sub-industries)
Corporate events: NOSIBLE Corporate Events (~80 action types)
Political events: PLOVER Event Types (~40 verb types)
Disaster events: EM-DAT Disasters (~25 sub-types)
Cyber events: MITRE ATT&CK (14 tactics, ~200 techniques)
Health events: ICD-11 Chapters (~150 chapters/blocks)
Sports events: IPTC SportsML (~100 sport/format combos)
Cultural events: schema.org Event (~30 event types)
Framing: Media Frames (15 policy frames)
Emotion: Ekman 6 Emotions (6 universal categories)
Ad / brand-safety: IAB Content Taxonomy (~700 codes, 4 tiers)

Two more ontologies, Wikidata QIDs and C2PA Content Credentials, are staged for v1.1. See the full methodology.

The technology

NOSIBLE's search engine continuously ingests millions of articles from the open web across every major language and writes pre-built daily snapshots. NOSIBLE World consumes those snapshots through a nine-stage pipeline that includes ingestion, clustering with Leiden community detection, same-incident deduplication, entity-aware splitting via spaCy NER, cross-lingual merging, cross-day story chaining, single-call LLM enrichment (one Gemini call per event, not per article), and multi-ontology classification via the embedding ensemble described above.
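
The "one call per event, not per article" choice is easy to picture as a grouping step. The sketch below only builds the request payloads and calls no model. The article fields (`event_id`, `title`), the payload shape, and the `build_enrichment_requests` helper are assumptions made for illustration.

```python
from collections import defaultdict


def build_enrichment_requests(articles):
    """Group articles by event and emit one request payload per event."""
    by_event = defaultdict(list)
    for article in articles:
        by_event[article["event_id"]].append(article)

    requests = []
    for event_id, group in by_event.items():
        # One payload carries every headline in the event, so the
        # enrichment model would be called once per event.
        requests.append({
            "event_id": event_id,
            "headlines": [article["title"] for article in group],
        })
    return requests


# Five hypothetical articles that were clustered into two events.
ARTICLES = [
    {"event_id": "evt-a", "title": "Port strike enters second week"},
    {"event_id": "evt-a", "title": "Dockworkers extend walkout"},
    {"event_id": "evt-a", "title": "Shipping delays mount as strike continues"},
    {"event_id": "evt-b", "title": "Central bank holds rates"},
    {"event_id": "evt-b", "title": "Rates unchanged after policy meeting"},
]

requests = build_enrichment_requests(ARTICLES)
```

Here five articles produce two requests. The call count follows the number of events, so it does not grow when more outlets cover the same story.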

Events carry a deterministic identifier derived from their date, IAB category, geography, and topic code — reproducible from the event record alone, collision-resistant across 20+ years of events, stable under re-runs of the same day's data. Every artefact is written atomically via tmp-then-rename so the front end never reads a partial file.
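
Both ideas in this paragraph can be sketched with the Python standard library. The source names the four input fields but not how they are combined, so the hash function (SHA-256), the `|` separator, the 16-character truncation, and the helper names `event_id` and `write_atomic` are all assumptions made for illustration.

```python
import hashlib
import json
import os
import tempfile


def event_id(date, iab_category, geography, topic_code):
    """Derive a reproducible identifier from the four fields (assumed scheme)."""
    key = "|".join([date, iab_category, geography, topic_code])
    # Truncated to 16 hex characters (64 bits) for readability; this length
    # is an assumption, not the documented identifier format.
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def write_atomic(path, payload):
    """Write JSON to a temp file in the target directory, then rename it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace swaps the file in one step, so a reader sees either
        # the old file or the complete new one.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Hypothetical field values; re-running with the same inputs gives the same ID.
first = event_id("2024-05-01", "IAB-52", "ZA", "04000000")
second = event_id("2024-05-01", "IAB-52", "ZA", "04000000")
```

The temp file is created in the same directory as the target because a rename is only atomic within a single filesystem.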

The data model is public — the full specification is committed to this repository as DATA_MODEL.md. No proprietary encoding, no undocumented fields. Analysts who want to query the raw data programmatically can use the enterprise API or contact us about a direct feed.

Team

NOSIBLE World is built by a small team. The platform was designed and built in-house from the ground up — no off-the-shelf event feeds, no third-party classification layer.

Stuart Reid, Founder
Designed the multi-ontology classification architecture, data model, and product strategy. Based in [location].
Engineering team: Platform, pipeline, frontend
The core pipeline, embedding ensemble, and this interface were built by a small in-house engineering team. Hiring — reach out if you are interested.

We are actively looking for engineers who care about information infrastructure, multilingual NLP, and financial data. Email careers@nosible.com.

Press

We have not yet pursued press coverage — NOSIBLE World is still in early access. If you are a journalist or analyst who would like to write about the platform, please see the press kit or email press@nosible.com.

Press mentions will appear here. Reach out to be first.

Contact

General: hello@nosible.com
Press: press@nosible.com
Enterprise / data: enterprise@nosible.com
Careers: careers@nosible.com