Wednesday, June 20, 2007

Thinking of removing your DST software config? Read this...

Thinking that this was no longer a large issue, I nonchalantly removed the DST software config from ~1300 users last week. About a few hours later I get some inquiries from the regional offices to the effect of:

"We had some people call about receiving a 'Permission's restrictions need to be updated, do you want to restart your device now or later' prompt on their handheld. They accepted it and the device restarted, but I had never seen this before, what is it?"

This is the "Oh s***" moment, when you realize that something seemingly innocuous had directly impacted each and every user you modified in bulk.

You could say this was pretty minor, their functionality was fine and all they needed to do was restart the device, but that type of stuff doesn't fly in our environment. Everything that directly impacts the user requires a change control approval process.

Time to move the function "unapply software configuration" from daily administrative task to change-controlled lockdown!

Apparently this happens when you initially followed RIM's recommendation and created a specific software policy for DST which allowed external communications. Removing this software policy reverts it back to some default policy which requires a restart, hence the user prompt.

Oh well, live and learn. (...and share the pain!)

Tuesday, June 19, 2007

Lotus Domino TimeDate Structure

This structure is used in many places within Domino, such as creation / modification times as well as database and replica ID's. Here is a diagram showing just what nuggets of interesting information are stored in this critical Notes data structure:


Tuesday, June 5, 2007

Analysis: "Transaction Error: Failure at Service"

Had this error pop up on a handheld today when trying to send from the device, got the dreaded red X. Receiving messages, however, was fine. Usually this boils down to one of a couple of issues:

1) BES server issue
2) Carrier issue
3) Service Book issue

Since the BES was fine and the carrier seemed to be OK for other users, it must have been a Service Book issue. Tried deleting and then undeleting the Desktop [CMIME] service book, didn't work. Tried deleting the Desktop [CMIME] service book, then pushing service books from the BES. Didn't work. Since lookups were also having a problem, I theorized that the odds of 2 service books (lookups are the Desktop [ALP] service book) getting corrupted was pretty small, so time to move onto the next possible culprit.

To take the carrier out of the picture, I hooked the device up to USB so that the messages would be sent using the BlackBerry Router over USB / LAN, totally bypassing the wireless carrier and thus getting that variable out of the picture.

Well I still got the Red X. Hmmm.... but this time it did not specify "Failure at Service". This time the sub-error was "Decryption Failure". Interesting: so now we are getting down to the root of the problem and I can definitely say it is not the carrier or the service books.

So I check the Dispatcher logs, since it is responsible for encryption/decryption to the device, and voila:

[30368] (06/05 10:41:18):{0xFC0} {User} Packet has been delivered to device, Tag=4758107
[40700] (06/05 10:41:44):{0xD8} {User} Receiving packet from device, size=127, TransactionId=1429352292, Tag=259213, content type=ITADMIN, cmd=0x3
[20209] (06/05 10:41:44):{0xD8} {User} DecryptDecompress() failed, Tag=259213, Error=604
[40275] (06/05 10:41:44):{0xD8} {User} Sending transaction error to device for transaction 1429352292, size=46, TransactionId=-966799068, Tag=4758202

So now I know it is an encryption issue, so let's generate a new key:

[40700] (06/05 11:05:13.616):{0x1468} {User} Receiving packet from device, size=225, TransactionId=1429352316, Tag=7143787, content type=OTAKEYGEN, cmd
=0x3
[30222] (06/05 11:05:13.616):{0x1468} {User} MFH: contentType=OTAKEYGEN, sizeOTA=180, sizeOTW=180, TransactionId=1429352316, Tag=7143787
[30308] (06/05 11:05:13.616):{0x1468} [BIPPa] {User} Forwarding data to BES Agent (S45776843), size=218, intTag=1911024, Tag=7143787
[30311] (06/05 11:05:13.616):{0x1438} {User} Forwarding status to relay, intTag=1911024, Tag=7143787, Status=1
[40279] (06/05 11:05:13.616):{0x1438} {User} SubmitToRelaySendQ, Tag=7143787
[30222] (06/05 11:05:13.616):{0x1288} {User} MTH: contentType=OTAKEYGEN, sizeOTA=14, sizeOTW=14, TransactionId=-2070675455, Tag=6675317
[30310] (06/05 11:05:13.616):{0x1288} {User} Forwarding internal data to device, contentType=OTAKEYGEN, routing=S45776843, device=23F7A64F, size=55, cm
d=0x3, ack=0, TransactionId=-2070675455, intTag=7486040, Tag=6675317, Submit=1
[40279] (06/05 11:05:13.616):{0x1288} {User} SubmitToRelaySendQ, Tag=6675317

Well it looks like it didn't do much, let's go back to the MAGT log and see what it did during this same time period:

[40076] (06/05 11:05:13.616):{0x920} {User} SendStatusToWirelessNetworkUsingSRP SendQ, TID=1911024
[40400] (06/05 11:05:13.616):{0x920} {User} Received datagram with content type OTAKEYGEN, TID=1911024, for user User
[40386] (06/05 11:05:13.616):{0x920} {User} {User} Sending message to device, Size=49, TID=7486040, TransactionId=-2070675455
[40081] (06/05 11:05:13.616):{0x920} {User} SendToWirelessNetworkUsingSRP - SendQ, TID=7486040
[30591] (06/05 11:05:13.616):{0x920} {User} {User} *** Activation *** transaction aborted.


OK, so it looks like when trying to generate an encryption key from the device, it goes through a mini Activation, however it aborts the activation, I am guessing, because it can't establish encryption with the device. So it's a catch-22.

Let's hook it up to the USB cable and provision the device & see what happens, theoretically this should just so the same things - resend the service books and regenerate the key - but do it over the wire. Let's look at the MAGT logs:

[30615] (06/05 11:09:33.130):{0x920} {User} Received RPC command [ProvisionUser] (id=1666)
[40000] (06/05 11:09:33.130):{0x920} {User} Synchronizing user configuration
[40000] (06/05 11:09:33.130):{0x920} {User} Synchronizing device capabilities
[40000] (06/05 11:09:33.161):{0x920} {User} ProvisionUser: Updated handheld capabilities info
[40000] (06/05 11:09:33.161):{0x920} {User} ProvisionUser: Handheld capabilities indicate AES encryption is supported
[40000] (06/05 11:09:33.161):{0x920} {User} ProvisionUser: Using key type AES
[40000] (06/05 11:09:33.161):{0x920} {User} Synchronizing AES encryption keys
[30000] (06/05 11:09:33.192):{0x920} {User} *** Activation *** User activated; triggering message pre-population
[30616] (06/05 11:09:33.192):{0x920} {User} RPC command [ProvisionUser] successful (id=1666)


Everything looks OK there, and now we can do lookups and send mail just fine.

What is different about the USB provisioning? Ideally I would like to be able to fix these types of problems OTA, in case someone is out of the office. I don't like the idea of having to hook up to the BES Manager and provision via USB to fix. Anybody out there know anything I could have tried OTA before hooking up to USB, without having to go through a full wireless enterprise activation?