Three agencies track maritime piracy. They don't agree on the numbers.

Our dashboard currently shows 1 piracy incident in the last 7 days. That number comes from the IMB Live Piracy Map — a 3.9MB JSON payload from the ICC International Maritime Bureau, updated daily, containing every reported incident going back years.

But "1 incident in 7 days" is the IMB's count. ReCAAP ISC, which covers Asia-Pacific waters, might have a different number for the same week. UKMTO, which monitors the Indian Ocean and Gulf of Aden, would have its own. And all three define "piracy incident" slightly differently.

This isn't a niche data-quality complaint. If you're an underwriter setting a war risk premium based on piracy frequency, the source you check determines the number you see. And the number you see determines the rate you charge.

The three agencies

There are other organizations that track maritime crime — MDAT-GoG for the Gulf of Guinea, various national navies, commercial AIS analytics firms. But these three are the ones that most underwriting desks and fleet managers actually reference. Here's how they differ.

Agency	Coverage	Format	Update cadence	Access
ICC IMB	Global	Live map (JSON API)	Daily	Public
ReCAAP ISC	Asia-Pacific	PDF reports	Weekly	Public
UKMTO	Indian Ocean, Red Sea, Gulf of Aden	Advisory bulletins	As-needed	Blocked (HTTP 403)

Already you can see the first problem. Coverage overlaps for Southeast Asia (both IMB and ReCAAP cover Malacca Strait) and for the Indian Ocean (both IMB and UKMTO). But the Gulf of Guinea gets only IMB. The Caribbean gets only IMB. And if you're relying solely on ReCAAP for your Asia-Pacific picture, you're missing everything west of longitude 95°E.

What IMB gives you (and what it doesn't)

The IMB Live Piracy Map is the closest thing to a global piracy data feed. It's backed by a WordPress API (specifically WP Go Maps) that returns marker data with coordinates, category IDs, and sitrep text.

Category IDs map to five types:

Attempted
Boarded
Fired upon
Hijacked
Suspicious

Most of the detail is in the sitrep text, which follows a format like DD.MM.YYYY: HHMM UTC followed by a narrative. Vessel names, IMO numbers, and positions are all embedded in free text, not structured fields. Parsing this requires regex extraction and inference. "Boarded" in IMB typically means armed robbery in the IMO's classification. "Fired upon" could be armed robbery or piracy depending on location (territorial waters vs. high seas).
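As a rough illustration of the regex step, here is a minimal sketch that pulls the timestamp out of a sitrep header. It assumes the header looks exactly like "DD.MM.YYYY: HHMM UTC" followed by narrative text; the function name parse_sitrep and the returned keys are hypothetical, and real sitreps will have variants this pattern misses.

```python
import re
from datetime import datetime, timezone

# Assumed header shape: "DD.MM.YYYY: HHMM UTC", then optional punctuation,
# then the free-text narrative. Real feeds may deviate from this.
SITREP_HEADER = re.compile(
    r"^\s*(?P<day>\d{2})\.(?P<month>\d{2})\.(?P<year>\d{4}):\s*"
    r"(?P<hour>\d{2})(?P<minute>\d{2})\s*UTC\b[:\s,.-]*(?P<narrative>.*)$",
    re.DOTALL,
)


def parse_sitrep(text):
    """Return {'occurred_at', 'narrative'} or None if the header is unrecognized."""
    m = SITREP_HEADER.match(text)
    if m is None:
        return None
    try:
        occurred_at = datetime(
            int(m["year"]), int(m["month"]), int(m["day"]),
            int(m["hour"]), int(m["minute"]), tzinfo=timezone.utc,
        )
    except ValueError:
        # Digits matched but do not form a real date/time (e.g. 31.02.2024).
        return None
    return {"occurred_at": occurred_at, "narrative": m["narrative"].strip()}
```

Returning None instead of raising keeps one malformed marker from aborting a whole batch. Vessel names and IMO numbers would need further patterns on the narrative part.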

The response is also large — roughly 3.9MB — because it contains every marker the map has ever displayed. Fetching it reliably requires a 30-second timeout window. We cache it for 1 hour.
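The timeout and cache settings above could be wired together roughly like this. This is a sketch, not our production fetcher: fetch_json and TTLCache are hypothetical names, the endpoint URL is deliberately left to the caller, and only the 30-second timeout and 1-hour TTL come from the text.

```python
import json
import time
import urllib.request

FETCH_TIMEOUT_S = 30    # from the text: the payload needs a generous timeout
CACHE_TTL_S = 60 * 60   # from the text: cached for 1 hour


def fetch_json(url, timeout=FETCH_TIMEOUT_S):
    """Fetch and decode a JSON document, failing if the server stalls past the timeout."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


class TTLCache:
    """Holds one value and reloads it once it is older than ttl_s seconds."""

    def __init__(self, loader, ttl_s=CACHE_TTL_S, clock=time.monotonic):
        self._loader = loader
        self._ttl_s = ttl_s
        self._clock = clock
        self._value = None
        self._loaded_at = None

    def get(self):
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl_s:
            self._value = self._loader()
            self._loaded_at = now
        return self._value
```

A caller would wrap the fetch in the cache, for example TTLCache(lambda: fetch_json(url)), so repeated dashboard loads within the hour never re-download the full marker set.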

The IMO definition split

The IMO distinguishes between piracy (high seas, per UNCLOS Article 101) and armed robbery against ships (territorial waters, per IMO Assembly Resolution A.1025(26)). IMB lumps both under "piracy" in their public data. This matters: an incident in the Singapore Strait anchorage is legally armed robbery, not piracy, even though the IMB map marker looks the same. Underwriters pricing "piracy risk" may be including events that fall outside the legal definition — or excluding events that fall inside it.

ReCAAP: deep regional data, PDF-locked

The Regional Cooperation Agreement on Combating Piracy and Armed Robbery against Ships in Asia covers 21 contracting parties. Their Information Sharing Centre in Singapore publishes detailed incident reports — dates, coordinates in DMS format (DD°MM'N DDD°MM'E), vessel details, classification into four significance levels (CAT 1 through CAT 4).

The data quality is high. The format is a problem. It's PDF.

To get structured data from ReCAAP, we scrape their /reports/ page, download the weekly incident PDFs, extract text from the PDF stream objects (they're text-based, not scanned), and either run the extracted text through an LLM for structured extraction or fall back to regex-based coordinate parsing. This is fragile. PDF layout changes break extraction. Some reports use slightly different coordinate formats. We process up to 10 PDFs per fetch, cache each for 24 hours, and deduplicate against IMB using a proximity threshold: if two incidents from different sources share the same category and fall within 5 nautical miles and 48 hours of each other, we treat them as the same event.
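The regex fallback for coordinates might look something like the sketch below. It assumes the DD°MM'N DDD°MM'E shape described above, tolerates decimal minutes, and returns signed decimal degrees; parse_position is a hypothetical name, and the pattern will not catch every layout variant found in real reports.

```python
import re

# Assumed shape: degrees, minutes (optionally decimal), hemisphere letter,
# for latitude then longitude, separated by whitespace or a comma.
DM_PAIR = re.compile(
    r"(?P<lat_deg>\d{1,2})°\s*(?P<lat_min>\d{1,2}(?:\.\d+)?)['′’]\s*(?P<lat_hem>[NS])"
    r"[\s,]+"
    r"(?P<lon_deg>\d{1,3})°\s*(?P<lon_min>\d{1,2}(?:\.\d+)?)['′’]\s*(?P<lon_hem>[EW])"
)


def parse_position(text):
    """Return (lat, lon) in signed decimal degrees, or None if nothing plausible is found."""
    m = DM_PAIR.search(text)
    if m is None:
        return None
    lat_min, lon_min = float(m["lat_min"]), float(m["lon_min"])
    if lat_min >= 60 or lon_min >= 60:
        return None  # minutes must be below 60
    lat = int(m["lat_deg"]) + lat_min / 60
    lon = int(m["lon_deg"]) + lon_min / 60
    if lat > 90 or lon > 180:
        return None  # out of range, probably a mis-extraction
    if m["lat_hem"] == "S":
        lat = -lat
    if m["lon_hem"] == "W":
        lon = -lon
    return (lat, lon)
```

The range checks matter more than they look: text extracted from PDFs occasionally glues unrelated digits together, and rejecting an impossible position is safer than plotting it.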

ReCAAP's category system

ReCAAP uses a severity-based classification (CAT 1-4) rather than an incident-type classification. CAT 1 is the most significant (crew violence, weapons used, vessel hijacked). CAT 4 is petty theft. This doesn't map cleanly to IMB's attempted/boarded/fired-upon/hijacked schema. A ReCAAP CAT 3 "boarding" might be IMB "attempted" or "boarded" depending on whether items were actually stolen. We normalize both into a shared taxonomy: HIJACKING, ARMED_ROBBERY, BOARDING, ATTEMPTED, SUSPICIOUS, THEFT, OTHER.
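To make the shape of that normalization concrete, here is a simplified sketch. The seven category names come from the text, and two mappings are stated elsewhere in this article (IMB "Boarded" becomes ARMED_ROBBERY, ReCAAP CAT 4 petty theft becomes THEFT). Every other rule below is an illustrative assumption, as are the function names and the hijacked and weapons_reported flags.

```python
from enum import Enum


class Category(Enum):
    HIJACKING = "HIJACKING"
    ARMED_ROBBERY = "ARMED_ROBBERY"
    BOARDING = "BOARDING"
    ATTEMPTED = "ATTEMPTED"
    SUSPICIOUS = "SUSPICIOUS"
    THEFT = "THEFT"
    OTHER = "OTHER"


# Only the "boarded" row is taken from the article; the rest are assumptions.
IMB_TO_SHARED = {
    "attempted": Category.ATTEMPTED,
    "boarded": Category.ARMED_ROBBERY,
    "fired upon": Category.ARMED_ROBBERY,  # assumption
    "hijacked": Category.HIJACKING,
    "suspicious": Category.SUSPICIOUS,
}


def normalize_imb(category_name):
    """Map an IMB incident-type label onto the shared taxonomy."""
    return IMB_TO_SHARED.get(category_name.strip().lower(), Category.OTHER)


def normalize_recaap(cat_level, hijacked=False, weapons_reported=False):
    """Map a ReCAAP severity level plus (assumed) extracted flags onto the shared taxonomy."""
    if hijacked:
        return Category.HIJACKING
    if weapons_reported:
        return Category.ARMED_ROBBERY
    if cat_level == 4:
        return Category.THEFT  # from the article: CAT 4 petty theft
    if cat_level in (1, 2, 3):
        return Category.BOARDING  # assumption: unarmed boarding of some significance
    return Category.OTHER
```

Because ReCAAP levels describe severity rather than incident type, the ReCAAP side needs extra facts pulled from the report text. That is where most of the ambiguity lives.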

UKMTO: the gap in the data

United Kingdom Maritime Trade Operations runs the voluntary reporting scheme for the Indian Ocean, Arabian Sea, Gulf of Aden, Red Sea, and Bab el-Mandeb. If you're a merchant vessel transiting these waters, UKMTO is who you report suspicious activity to.

The problem: their website returns HTTP 403. Cloudflare protection. No public API. No RSS feed. No structured data export.

Figure: A Fujairah-to-Djibouti route through the Gulf of Aden, exactly the waters UKMTO monitors. JWC listed areas are shown in red. Our piracy scoring for this route relies on IMB data because UKMTO's data is inaccessible to automated systems.

This means the single most relevant reporting authority for one of the highest-risk maritime corridors in the world provides zero structured data to automated systems. The Indian Ocean and Gulf of Aden piracy picture that any platform can build — ours included — is an IMB-only view of a region where UKMTO has better on-the-ground intelligence.

IMB does cover the same geography. Their coverage is global. But UKMTO advisories often include threat assessments and guidance that IMB incident markers don't capture — tactical recommendations, convoy coordination notes, threat pattern analysis. That context is unavailable to anyone who isn't manually reading UKMTO bulletins.

Where the gaps show up

When we built ArcNautical's piracy data pipeline, we wanted to fuse all three sources into a single, deduplicated incident feed. What we actually got was this:

Region	IMB	ReCAAP	UKMTO	Gap
Strait of Malacca	Yes	Yes	No	Best covered. Two overlapping sources.
South China Sea	Yes	Yes	No	Good coverage, but SCS incidents are politically sensitive — some go unreported.
Gulf of Aden	Yes	No	403	IMB only. UKMTO has data but we can't access it.
Gulf of Guinea	Yes	No	No	IMB only. MDAT-GoG exists but no API.
Arabian Sea	Yes	No	403	IMB only. UKMTO coverage area but inaccessible.
Caribbean	Yes	No	No	IMB only. Venezuela/Trinidad incidents likely underreported.
Indian Ocean	Yes	Partial	403	Fragments. ReCAAP covers Bay of Bengal segment only.

The pattern is clear: outside of Southeast Asian waters, the global piracy picture is an IMB-only picture. And while IMB's coverage is genuinely global, they depend on voluntary reporting. Many incidents go unreported — either because the vessel operator doesn't want the attention, because the flag state discourages it, or because the incident was "minor" enough (petty theft at anchorage) that the master didn't bother.

The piracy number you see depends on who you ask, what they define as piracy, and whether anyone reported it in the first place.

The classification problem

Even when two agencies report the same incident, they may classify it differently. Consider a hypothetical event: four men in a speedboat approach a container vessel at anchor in the Singapore Strait. They board, steal mooring equipment from the forecastle, and leave before the crew is alerted.

IMB would likely call this Category 2: Boarded. The vessel was boarded by unauthorized persons. In our normalization, this becomes ARMED_ROBBERY, even though no weapons were reported. IMB's "boarded" category doesn't distinguish between armed and unarmed boarding.

ReCAAP would likely call this CAT 4: Petty theft. No weapons, no crew confrontation, minor items stolen. In our normalization, this becomes THEFT. A less severe classification for the identical event.

Both are "correct" within their own frameworks. But if your piracy risk model counts ARMED_ROBBERY events, the IMB feed inflates the count for Singapore Strait anchorages relative to what ReCAAP reports. If your model filters for only CAT 1-2 (significant) ReCAAP events, you'd miss this incident entirely.

We handle this with a shared normalization taxonomy — seven categories that both sources get mapped into, with explicit rules for the ambiguous cases. It's imperfect. A boarded-but-unarmed incident getting classified as ARMED_ROBBERY by one source and THEFT by another is a real problem that no taxonomy fully solves. But at least the normalization makes the disagreement visible rather than hiding it behind different nomenclatures.

What this means for scoring

Figure: A Fujairah-to-Djibouti voyage scoring 46 (ELEVATED), with the signal breakdown. The piracy signal is one of ten that feed the composite. Its weight depends on how many incidents our spatial query finds within range of the route, and that count depends entirely on which sources have data for that corridor.

When ArcNautical scores a voyage, the piracy signal is one of ten inputs to the composite risk score. The spatial query searches for incidents within a distance threshold of the actual route geometry — not a bounding box, not a port pair, but the polyline the vessel would actually follow through the water.
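The idea of matching against the route polyline rather than a bounding box can be sketched in a few lines. This is not the production query: it uses a local flat-earth projection around each incident, which is reasonable for thresholds of tens of nautical miles but ignores the antimeridian and loses accuracy on very long segments. The function names are hypothetical, and the threshold is a parameter because the text does not state a value.

```python
import math

EARTH_RADIUS_NM = 3440.065  # mean Earth radius in nautical miles


def _to_local_xy(lat, lon, ref_lat, ref_lon):
    """Project (lat, lon) to x/y nautical miles on a plane centred on the reference point."""
    x = math.radians(lon - ref_lon) * math.cos(math.radians(ref_lat)) * EARTH_RADIUS_NM
    y = math.radians(lat - ref_lat) * EARTH_RADIUS_NM
    return x, y


def distance_to_segment_nm(point, seg_start, seg_end):
    """Approximate distance from point to the segment seg_start-seg_end; all are (lat, lon)."""
    ax, ay = _to_local_xy(seg_start[0], seg_start[1], point[0], point[1])
    bx, by = _to_local_xy(seg_end[0], seg_end[1], point[0], point[1])
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(ax, ay)  # degenerate segment: distance to its single point
    # The incident sits at the origin; clamp its projection onto the segment to [0, 1].
    t = max(0.0, min(1.0, -(ax * dx + ay * dy) / seg_len_sq))
    return math.hypot(ax + t * dx, ay + t * dy)


def incidents_near_route(route, incidents, threshold_nm):
    """Return incidents within threshold_nm of any leg of the route (needs 2+ waypoints)."""
    legs = list(zip(route, route[1:]))
    return [
        inc for inc in incidents
        if min(distance_to_segment_nm(inc, a, b) for a, b in legs) <= threshold_nm
    ]
```

The clamp is what separates this from a bounding-box test: an incident beyond the end of a leg is measured to the nearest waypoint, not to the infinite line through the leg.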

The count it returns is only as complete as the data in the arcnautical_piracy_incidents table, which gets populated every 6 hours by a cron job that fetches from IMB and ReCAAP (UKMTO returns nothing, as discussed). For a route through the Gulf of Aden, that means the piracy signal reflects IMB's view of the region. For a route through the Malacca Strait, it reflects both IMB and ReCAAP — a more complete picture.

This is why we built the dataConfidence field into every score response. It reports the fraction of our 10 signals that returned real data for a given route. If piracy data is thin for a corridor (say, only 1 incident in 90 days from a single source), the confidence score drops, and the UI shows a warning banner. The system tells you when it's working with incomplete information.
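In its simplest form, that fraction is just a count. The sketch below assumes each signal is reported as a value or None when no real data came back; the data_confidence name and the signal keys are placeholders, and what counts as "real data" for a thin signal is a policy decision this snippet does not capture.

```python
def data_confidence(signals):
    """Fraction of signals that returned data; `signals` maps name -> value or None."""
    if not signals:
        return 0.0
    with_data = sum(1 for value in signals.values() if value is not None)
    return with_data / len(signals)


# Example: 7 of 10 placeholder signals returned something.
example = {f"signal_{i}": (1.0 if i < 7 else None) for i in range(10)}
print(data_confidence(example))
```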

The undercount problem

Industry estimates suggest that actual piracy and armed robbery incidents outnumber reported ones by a factor of 2 to 5, depending on the region. West Africa is thought to have the widest gap between real and reported incidents. A risk model built purely on reported data will structurally undercount. We partially compensate with the Country Instability Index (CII), a signal derived from GDELT conflict event data that captures deteriorating security conditions even when specific incidents go unreported. For countries like Somalia (CII floor: 85) and Yemen (CII floor: 75), the CII acts as a minimum risk level regardless of how few piracy incidents are in the database.
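The floor behaves like a simple clamp from below. A minimal sketch, assuming the CII is on a 0 to 100 scale and that countries without a configured floor default to 0; the two floor values are from the text, while the function name and lookup structure are hypothetical.

```python
# Floors quoted in the text; any other country is assumed to have no floor.
CII_FLOORS = {"Somalia": 85, "Yemen": 75}


def apply_cii_floor(country, computed_cii):
    """Never let the instability score fall below the configured floor for a country."""
    return max(computed_cii, CII_FLOORS.get(country, 0))
```

So a quiet reporting period that would otherwise compute a Somalia score of 40 still yields 85, while a computed 92 passes through unchanged.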

Deduplication: when sources overlap

In Southeast Asian waters where IMB and ReCAAP both report, we need to avoid double-counting. Same incident, two records, two different IDs, possibly different coordinates (one agency rounds to whole minutes, the other reports decimal minutes).

Our deduplication logic: if two incidents from different sources share the same category, occurred within 48 hours of each other, and are within 5 nautical miles — they're the same event. We keep the record with the longer description (more detail). This is a heuristic, not a guarantee. Two genuine incidents 4nm apart on the same night would incorrectly merge. But the alternative — double-counted risk — is worse for scoring accuracy.
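The rule above translates fairly directly into code. This sketch uses the thresholds from the text (different sources, same category, 48 hours, 5 nautical miles, keep the longer description); the Incident fields, the function names, and the simple pairwise scan are assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta

EARTH_RADIUS_NM = 3440.065
MAX_DISTANCE_NM = 5.0             # from the text
MAX_TIME_GAP = timedelta(hours=48)  # from the text


@dataclass(frozen=True)
class Incident:
    source: str
    category: str
    occurred_at: datetime
    lat: float
    lon: float
    description: str


def haversine_nm(lat1, lon1, lat2, lon2):
    """Great-circle distance in nautical miles."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def is_same_event(a, b):
    return (
        a.source != b.source
        and a.category == b.category
        and abs(a.occurred_at - b.occurred_at) <= MAX_TIME_GAP
        and haversine_nm(a.lat, a.lon, b.lat, b.lon) <= MAX_DISTANCE_NM
    )


def deduplicate(incidents):
    """Merge cross-source duplicates, keeping whichever record has the longer description."""
    kept = []
    for inc in incidents:
        for i, existing in enumerate(kept):
            if is_same_event(inc, existing):
                if len(inc.description) > len(existing.description):
                    kept[i] = inc
                break
        else:
            kept.append(inc)
    return kept
```

Note that because the rule requires matching categories, an event that one source normalizes to ARMED_ROBBERY and the other to THEFT would not be merged by this sketch. The pairwise scan is quadratic, which is fine for a few thousand records but would want a spatial index beyond that.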

Figure: The live ArcNautical dashboard showing 1 piracy incident in the last 7 days across all sources, with 369 active NAVWARNs in the alert ticker. The chokepoint sidebar tracks disruption scores for the world's major maritime bottlenecks.

What we'd like to see change

Three things would make the global piracy picture meaningfully better:

UKMTO should publish an API. They already collect structured incident data. Publishing it as a JSON feed would cost them almost nothing and would give the entire maritime risk ecosystem better coverage for the Indian Ocean — arguably the world's most dangerous maritime corridor right now.
ReCAAP should stop publishing PDFs. Their data is excellent. The format is hostile to integration. A structured RSS or JSON feed with coordinates, dates, categories, and vessel details would be a gift to the industry. Some of their newer reports are machine-readable; most are not.
IMB should separate piracy from armed robbery. The IMO made the legal distinction for a reason. Lumping both into the same map marker, under the same "piracy" label, muddies the risk picture — especially for underwriters who need to know whether an incident triggers a piracy clause or a separate policy provision.

None of these changes are technically hard. They're institutional. The data exists. The organizations just don't expose it in a way that machines can consume.

We built ArcNautical's piracy pipeline to fuse the best available data from the best available sources. Right now, "best available" means IMB globally, ReCAAP for Asia-Pacific, and nothing from UKMTO. We're transparent about that because the gaps affect the scores, and anyone using those scores to make decisions should know where the data is strong and where it thins out.

If you work with piracy data in a different context — naval intelligence, flag state reporting, commercial AIS analytics — I'd genuinely like to hear what sources you use and what gaps you've found. The more we understand about what's missing, the more honestly we can represent what we know.
