Tuesday 14 May 2013

GW MMSI Encoding

In my previous post I discussed the GW protocol used by ships to send position reports and how this is decoded by Rivet, the free and open source HF modes decoder. The ship's identity wasn't encoded in the text string that contains the position report, but it obviously had to be somewhere in the data exchange, as without it how would GW know which ship was sending the report or which shipping company to pass it on to? My suspicion was that the ship's identity was most likely encoded in a data packet called the Type 2 Subtype 101 (2/101 for short), which is sent prior to each position report. At that time, when decoded by Rivet, these packets looked like ..

18:10:05 GW Type=2 Count=1 Subtype=101 (10001011100101) (0x85,0x65,0x22,0xac,0x66,0x66)

Note the 6 bytes displayed as hexadecimal numbers within the brackets at the end of the message. These are the payload of the packet (i.e. its contents). The payload changed with every position report but the payloads had similarities, so this had to be where the identity was located. It seemed to me that the ship's identity had to be encoded in one of four ways ..


  1. The ship's name
  2. The ship's callsign
  3. A unique customer number assigned by GW
  4. The ship's MMSI
Now as there are only six bytes in the payload that pretty much rules out the ship's name, and there weren't enough similarities in the payloads I was seeing for it to be the callsign. That leaves us with the GW customer number or the MMSI. The most common way of encoding MMSIs into radio data is simply to convert them into 30 bit binary sequences. I took a few 2/101 packets, converted every possible 30 bit sequence in their payloads into a number and used this website to see if any of them were valid MMSI numbers. None were, so that wasn't how they were encoded. Next I wondered if GW were using 4 bit BCD (Binary Coded Decimal), but if that were the case each hexadecimal digit of the payload would only ever be 0 to 9, whereas I was seeing 0 to F, so that wasn't it either.
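
For anyone wanting to repeat those two checks, here is a minimal Python sketch. It is not how I did it at the time, and it makes a few assumptions of its own: the function names are made up for illustration, the payload is read most significant bit first, and the "nine digit" filter is only a crude stand-in for looking each candidate up on an MMSI website.

```python
payload = bytes([0x85, 0x65, 0x22, 0xAC, 0x66, 0x66])

def thirty_bit_candidates(data):
    """Return the value of every 30 bit window in the data, one per bit offset."""
    total_bits = len(data) * 8
    value = int.from_bytes(data, "big")   # assumption: most significant bit first
    mask = (1 << 30) - 1
    candidates = []
    for offset in range(total_bits - 30 + 1):
        shift = total_bits - 30 - offset
        candidates.append((value >> shift) & mask)
    return candidates

def could_be_bcd(data):
    """Packed 4 bit BCD only ever uses nibble values 0 to 9."""
    return all((b >> 4) <= 9 and (b & 0x0F) <= 9 for b in data)

candidates = thirty_bit_candidates(payload)
# Crude filter (an assumption): keep values that are nine digits long.
nine_digit = [c for c in candidates if 100_000_000 <= c <= 999_999_999]

print(len(candidates), "windows checked,", len(nine_digit), "are nine digits long")
print("Could be BCD:", could_be_bcd(payload))
```

A 6 byte payload has 48 bits, so there are only 19 possible 30 bit windows to look at per packet. The BCD test fails straight away on this payload because of the 0xa and 0xc in the fourth byte.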

At this point I was pretty much stumped and out of ideas. Thankfully this was when Alan W got in touch with a clever idea. Just to introduce him, Alan W is an ex ship's radio officer who has been a co-conspirator of mine on various decoder projects, starting with MPT1327, then the taxi MDT decoder, DMRDecode and on to Rivet. Alan has an amazing ability to keep working on a problem in a methodical way even when it all seems hopeless. If he were many years older Alan would have been sat in a hut in Bletchley Park working out Enigma keys with pencil and paper, but instead he does this. Alan's idea was that since we know the ship's position we can use that (via various ship tracking methods) to identify the ship and from that reverse engineer how the MMSI is encoded in the 2/101 packet. Obviously this doesn't always work, as if a ship's position report puts it in a giant container port then it could be one of several possible ships, but the technique does work often enough to be useful. After many, many weeks of effort and several phone calls/emails where Alan indicated he was on to something, he delivered a bundle of paperwork to me. In short, by comparing MMSIs with the 2/101 packet payloads Alan had found some patterns. To give you a taste of this ..

0x00 in the packet represented the figures "33" in the ships MMSI
0x01 in the packet represented the figures "73" in the ships MMSI
0x02 in the packet represented the figures "13" in the ships MMSI

etc etc

In each byte (8 bit sequence) the 2nd hexadecimal digit represents the first of the two MMSI digits and the first hexadecimal digit the second of the MMSI digits. However he also found some oddities, so for instance ..

0x62 in the packet could represent the figures "90" or "10" in the MMSI

From this I was able to put together a little table of the encoding method used ..

0x0 = "3"
0x1 = "7"
0x2 = "1" or alternate value "9"
0x3 = "5"
0x4 = "2"
0x5 = "6"
0x6 = "0" or alternate value "8"
0x7 = "4"
0x8 = "3"
0x9 = "7"
0xa = "1" or alternate value "9"
0xb = "5"
0xc = "2"
0xd = "6"
0xe = "0" or alternate value "8"
0xf = "4"

Which interestingly corresponds to the GW 8 bit alphabet where 0x20 represents "1", 0x40 represents "2" and so on. At this point we could decode any 2/101 packet to a correct MMSI as long as the payload didn't contain a 0x2, 0x6, 0xa or 0xe, as each of these could represent one of two digits. So we had loads of decodes where MMSIs contained a 1 which should have been a 9 and so on. Again Alan was able to use his expertise to work out what was what. Then a couple of days ago I realised what was going on and how it all worked. Basically, when deciding if you need to use the alternate digit you look one hexadecimal digit ahead (going from left to right) and if that digit is 0x8 or higher you use the alternate.
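
To make the look ahead rule concrete, here is a small Python sketch of it applied to a single payload byte. The table is copied from the one above. The function names are just ones picked for this illustration and are not taken from the Rivet source.

```python
# Hex digit -> (usual MMSI digit, alternate digit or None).
TABLE = {
    0x0: ("3", None), 0x1: ("7", None), 0x2: ("1", "9"), 0x3: ("5", None),
    0x4: ("2", None), 0x5: ("6", None), 0x6: ("0", "8"), 0x7: ("4", None),
    0x8: ("3", None), 0x9: ("7", None), 0xA: ("1", "9"), 0xB: ("5", None),
    0xC: ("2", None), 0xD: ("6", None), 0xE: ("0", "8"), 0xF: ("4", None),
}

def nibble_to_digit(nibble, next_nibble):
    """Pick the MMSI digit for one hex digit by looking one hex digit ahead."""
    usual, alternate = TABLE[nibble]
    if alternate is not None and next_nibble >= 0x8:
        return alternate
    return usual

def byte_to_digits(byte, next_nibble):
    """Decode one byte; next_nibble is the first hex digit of the following byte."""
    high, low = byte >> 4, byte & 0x0F
    first = nibble_to_digit(high, low)           # looks ahead to the low hex digit
    second = nibble_to_digit(low, next_nibble)   # looks ahead into the next byte
    return second + first                        # the pair is swapped in the MMSI

print(byte_to_digits(0x62, 0x2))  # prints 10
print(byte_to_digits(0x62, 0xA))  # prints 90
```

Under this rule the 6 in 0x62 is always followed by a 2, so it can only ever be a "0". It is the 2 that flips between "1" and "9" depending on what starts the next byte, which fits Alan's observation that 0x62 turned up as either "10" or "90".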

The best way of illustrating this is with an example from the 2/101 packet at the start of this page. Its payload is ..

0x85,0x65,0x22,0xac,0x66,0x66

which when made into a single large number is ..

85 65 22 ac 66 66

Starting from the left, the 8 represents a 3 and the 5 a 6; these need reversing so the MMSI starts "63". Next the 6 represents a 0 (as it is followed by a 5, which is less than 8) and the 5 represents a 6, which after reversing gives "60". So at this point the MMSI is "6360". Now the first 2 represents a 1 (as it is followed by another 2, which is less than 8) but the next 2 is followed by 0xa, which is higher than 8, so the alternate value is used and the digit being represented is a 9. Reversing the pair gives "91", so now our MMSI is "636091". Next we have 0xa, which is followed by 0xc, so again the alternate value is used and we have a "9", and the 0xc represents a "2". Reversed, that pair is "29", so now the MMSI is "63609129". MMSIs always contain 9 digits so we have one left to go. This is a 0x6 which is followed by a 0x6, so it represents a "0". Thus our full MMSI is "636091290", which happens to be the Liberian flagged container ship CMA CGM PARSIFAL.
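
The whole walk through can be condensed into a few lines of Python. This is only a sketch of the method as described in this post, not the code from Rivet, and the names in it are invented for the example. It leans on the fact that the second half of the table (0x8 to 0xf) is a repeat of the first half, so only the low three bits of each hex digit are needed for the lookup. It also assumes that the very last hex digit, which has nothing after it, never takes its alternate value, and it simply throws away everything after the ninth digit.

```python
BASE_DIGITS = "37152604"            # MMSI digit for hex digits 0 to 7 (8 to f repeat)
ALTERNATES = {"1": "9", "0": "8"}   # used when the next hex digit is 0x8 or higher

def decode_mmsi(payload):
    """Decode a 6 byte 2/101 payload into a 9 digit MMSI string."""
    # Split the bytes into hex digits, left to right.
    nibbles = []
    for byte in payload:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)

    # Translate each hex digit, looking one hex digit ahead for the alternates.
    digits = []
    for i, nibble in enumerate(nibbles):
        digit = BASE_DIGITS[nibble & 0x7]
        # Assumption: the final hex digit has no follower, so treat it as 0.
        next_nibble = nibbles[i + 1] if i + 1 < len(nibbles) else 0
        if digit in ALTERNATES and next_nibble >= 0x8:
            digit = ALTERNATES[digit]
        digits.append(digit)

    # Swap the two digits that came from each byte.
    swapped = []
    for i in range(0, len(digits), 2):
        swapped.append(digits[i + 1])
        swapped.append(digits[i])

    # An MMSI is 9 digits long; whatever follows is ignored here.
    return "".join(swapped)[:9]

print(decode_mmsi(bytes([0x85, 0x65, 0x22, 0xAC, 0x66, 0x66])))  # prints 636091290
```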

So there you go, that is how it was done and I hope you have enjoyed this. Sorry for rambling on a bit. I hope this reads clearly, but it isn't always easy to put a process like this into words.

 Once again many thanks to Alan W for all his help.

 If you want to see the source code for Rivet this can be found here and a pre compiled JAR Java executable file can be downloaded from here.

 For news of Rivet updates and new blog posts please follow me on Twitter.