NOSIBLE World

About

NOSIBLE World started from a single observation: the news industry solved the classification problem decades ago with a handful of flat topic taxonomies. It never went back to ask whether those taxonomies were actually good enough for the analysts, quants, researchers, and journalists who rely on them every day. They weren't. They still aren't.

The thesis

Every news event has multiple simultaneous coordinates. The same earnings call is a Technology sector story (GICS), a corporate action (NOSIBLE Corporate Events), an Analysis piece (IPTC Genre), an economic-frame article (Media Frames), and, depending on the quarter, Joy or Surprise in emotional tone (Ekman). A single taxonomy captures exactly one of those dimensions and ignores the rest.

Legacy news taxonomies like IPTC Media Topics were designed for desks, not for analysis. They solve the routing problem (get this article to the right editor) without solving the research problem (find every event with a particular combination of sector, geography, political action type, and emotional register across the last three years). NOSIBLE World was built to solve the research problem.

The platform classifies every event across 13 complementary ontologies simultaneously: topical, sectoral, political, cyber, health, sports, cultural, framing, emotional, provenance, and more. The full list and the reasoning behind it are in the Methodology. Each ontology answers a different question. Together they let analysts pivot between coordinates, for example from country to sector to political action type, and surface events that no single-axis system would show.
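To make the idea of pivoting between coordinates concrete, here is a minimal sketch. It assumes each event carries one top-ranked label per ontology; the `Event` class, the `pivot` helper, the ontology keys, and the sample events are all hypothetical illustrations, not the platform's actual data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    title: str
    country: str
    # Hypothetical layout: ontology name -> top-ranked label for this event.
    labels: dict = field(default_factory=dict)


def pivot(events, country=None, **wanted):
    """Return events that match the country and every requested ontology label."""
    return [
        e for e in events
        if (country is None or e.country == country)
        and all(e.labels.get(name) == label for name, label in wanted.items())
    ]


# Invented sample events, for illustration only.
events = [
    Event("Chipmaker beats estimates", "US",
          {"gics": "Semiconductors", "ekman": "joy"}),
    Event("Export controls tightened", "US",
          {"gics": "Semiconductors", "plover": "sanction", "ekman": "fear"}),
    Event("Port strike enters second week", "DE",
          {"gics": "Marine Ports & Services", "plover": "protest", "ekman": "anger"}),
]

# Pivot from country, to sector, to political action type.
by_country = pivot(events, country="US")
by_sector_and_action = pivot(events, country="US",
                             gics="Semiconductors", plover="sanction")
print([e.title for e in by_sector_and_action])
```

A single-axis system could answer only the first query; the second depends on the sector and political labels being present on the same record.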

The classifier is not a monolithic LLM doing everything. It is a multilingual OpenAI embedding classifier: a single text-embedding-3-large vector, computed over the event title and description, is scored by cosine similarity against taxonomy caches. The top-ranked label per taxonomy ships. There is no LLM in the classification path.
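The scoring step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the production code: tiny three-dimensional vectors stand in for real text-embedding-3-large embeddings, the cache layout (taxonomy name mapped to label vectors) is a guess, and the names `cosine` and `classify` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify(event_vec, caches):
    """Return the top-ranked (label, score) pair for every taxonomy.

    caches maps taxonomy name -> {label: embedding vector} (assumed layout).
    """
    result = {}
    for taxonomy, label_vecs in caches.items():
        scored = [(label, cosine(event_vec, vec)) for label, vec in label_vecs.items()]
        result[taxonomy] = max(scored, key=lambda pair: pair[1])
    return result


# Toy vectors standing in for cached label embeddings (invented values).
caches = {
    "ekman": {"joy": [1.0, 0.0, 0.0], "fear": [0.0, 1.0, 0.0]},
    "genre": {"analysis": [0.6, 0.0, 0.8], "interview": [0.0, 0.6, 0.8]},
}
event_vec = [0.9, 0.1, 0.4]  # would be the embedding of title + description

top = classify(event_vec, caches)
print({taxonomy: label for taxonomy, (label, _) in top.items()})
```

Because the event is embedded once and only compared against cached vectors, adding a taxonomy in this sketch costs one more loop iteration rather than another model call.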

The 13 ontologies (v1)

A quick map of what each classification layer answers:

  • Topical spine: IPTC Media Topics (~1,200 codes)
  • Document type: IPTC Genre (~30 newsroom types)
  • Sectors: GICS Sub-Industry (158 sub-industries)
  • Corporate events: NOSIBLE Corporate Events (~80 action types)
  • Political events: PLOVER Event Types (~40 verb types)
  • Disaster events: EM-DAT Disasters (~25 sub-types)
  • Cyber events: MITRE ATT&CK (14 tactics, ~200 techniques)
  • Health events: ICD-11 Chapters (~150 chapters/blocks)
  • Sports events: IPTC SportsML (~100 sport/format combos)
  • Cultural events: schema.org Event (~30 event types)
  • Framing: Media Frames (15 policy frames)
  • Emotion: Ekman 6 Emotions (6 universal categories)
  • Ad / brand-safety: IAB Content Taxonomy (~700 codes, 4 tiers)

Two more ontologies, Wikidata QIDs and C2PA Content Credentials, are staged for v1.1.

The technology

NOSIBLE's search engine continuously ingests millions of articles from the open web across every major language and writes pre-built daily snapshots. NOSIBLE World consumes those snapshots through a nine-stage pipeline. Its stages include ingest, clustering with Leiden community detection, same-incident deduplication, entity-aware splitting via spaCy NER, cross-lingual merging, cross-day story chaining, single-call LLM enrichment (one Gemini call per event, not per article), and multi-ontology classification via the embedding ensemble described above.

Events carry a deterministic identifier derived from their date, IAB category, geography, and topic code — reproducible from the event record alone, collision-resistant across 20+ years of events, stable under re-runs of the same day's data. Every artefact is written atomically via tmp-then-rename so the front end never reads a partial file.
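The two properties in this paragraph, reproducible identifiers and atomic writes, can be sketched as below. This is a sketch under assumptions, not the platform's actual scheme: the field separator, the use of SHA-256, the 16-character truncation, the function names `event_id` and `write_atomic`, and the sample field values are all hypothetical.

```python
import hashlib
import json
import os
import tempfile


def event_id(date, iab_category, geography, topic_code):
    """Derive an ID from the record's own fields, so re-runs give the same value."""
    key = "|".join([date, iab_category, geography, topic_code])
    # Truncating to 16 hex characters (64 bits) is an illustrative choice.
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def write_atomic(path, payload):
    """Write JSON to a temp file in the target directory, then rename over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(payload, f)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes are on disk before the rename
        os.replace(tmp, path)     # readers see the old file or the new one, never a partial one
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise


# Invented sample values, for illustration only.
eid = event_id("2024-05-01", "IAB-52", "US", "04000000")

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "event.json")
    write_atomic(target, {"id": eid})
    with open(target, encoding="utf-8") as f:
        stored = json.load(f)
    leftovers = [name for name in os.listdir(d) if name.endswith(".tmp")]
```

Creating the temporary file in the same directory as the target keeps the rename on one filesystem, which is what lets `os.replace` swap the file in a single step.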

The data model is public — the full specification is committed to this repository as DATA_MODEL.md. No proprietary encoding, no undocumented fields. Analysts who want to query the raw data programmatically can use the enterprise API or contact us about a direct feed.

Team

NOSIBLE World is built by a small team. The platform was designed and built in-house from the ground up — no off-the-shelf event feeds, no third-party classification layer.

  • Stuart Reid, Founder

    Designed the multi-ontology classification architecture, data model, and product strategy. Based in [location].

  • Engineering team: Platform, pipeline, frontend

    The core pipeline, embedding ensemble, and this interface were built by a small in-house engineering team. We are hiring; reach out if you are interested.

We are actively looking for engineers who care about information infrastructure, multilingual NLP, and financial data. Email careers@nosible.com.

Press

We have not yet pursued press coverage because NOSIBLE World is still in early access. If you are a journalist or analyst who would like to write about the platform, please see the press kit or email press@nosible.com.

Press mentions will appear here. Reach out to be first.

Contact

Enterprise / data
enterprise@nosible.com