Truncating UTF8 Input For Apple Push Notifications (APNS) in Ruby
When sending push notifications (APNS) to Apple devices such as the iPhone or iPad, there is a constraint that makes implementing it a bit challenging:
The maximum size allowed for a notification payload is 256 bytes; Apple Push Notification Service refuses any notification that exceeds this limit
This wouldn’t be a problem in itself unless you want to put user input into the notification. It also wouldn’t be that hard unless the input can be international and contain non-ASCII characters. Even that would be manageable, but the payload is JSON, and that sometimes complicates things a little more. Who said that formatting push notifications is easy?
Desired payload
{
  aps: {
    alert: "'User X' started following you",
  },
  path: "appnameios://users/123",
}
This is a simplified version of our payload. The notification is about someone who started following you on the fancy social platform that we are writing. The path allows the app to open a view related to the user who started following you. The things that vary are the user name (User X in our example) and the user id (123).
Payload template
So let’s extract the template of the payload into a method. This will come in handy later:
def payload_template(user_name, user_id)
  {
    aps: {
      alert: "'#{user_name}' started following you",
    },
    path: "appnameios://users/#{user_id}",
  }
end
Bytes, bytes everywhere
Remember when I said that we have 256 bytes? We do, but the number of useful bytes in our case is even smaller.
payload_template("", "").to_json.bytesize
# => 73
Even when we don’t substitute any data into our payload, 73 bytes are already used. That means we have only…
MAX_APS_BYTES = 256

def payload_arg_max_size
  MAX_APS_BYTES - payload_without_args_size
end

def payload_without_args_size
  payload_template("", "").to_json.bytesize
end

payload_arg_max_size
# => 183
… 183 bytes for user input
If your payload (required for the app to properly behave when the notification is clicked) is bigger or your message is longer you are left with even fewer bytes of user input.
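To see why the budget has to be counted in bytes rather than characters, here is a small illustration. The sample strings are only examples; any UTF-8 text behaves the same way.

```ruby
# String#length counts characters, String#bytesize counts UTF-8 bytes.
ascii  = "abc"
polish = "łøü" # each of these characters takes 2 bytes in UTF-8
emoji  = "😀"  # a single character that takes 4 bytes in UTF-8

ascii.length    # => 3
ascii.bytesize  # => 3
polish.length   # => 3
polish.bytesize # => 6
emoji.length    # => 1
emoji.bytesize  # => 4

# With 183 bytes to spend, how many characters fit depends on the script:
183 / 1 # => 183 one-byte (ASCII) characters
183 / 2 # => 91 two-byte characters
183 / 4 # => 45 four-byte characters
```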
Not everything can be truncated
But wait… We can’t truncate the user id. If we did, we could mislead the recipient about who actually started following them. So even though its length varies, we can’t truncate it.
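As a rough illustration of how the id eats into the budget, the sketch below reuses the template from above and keeps the id intact. `name_budget` is a name made up for this example.

```ruby
require 'json'

MAX_APS_BYTES = 256

def payload_template(user_name, user_id)
  {
    aps: { alert: "'#{user_name}' started following you" },
    path: "appnameios://users/#{user_id}",
  }
end

# Bytes left for the user name once the untruncated id is in place.
# (name_budget is a hypothetical helper, not part of the classes in this post.)
def name_budget(user_id)
  MAX_APS_BYTES - payload_template("", user_id).to_json.bytesize
end

name_budget(123)          # => 180
name_budget(123457890123) # => 171
```

The longer the id, the fewer bytes remain for the name, so the budget has to be computed per notification rather than once.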
We can see that the logic for this is slowly getting more and more complicated. That’s why for every push notification we have a class that encapsulates the logic of formatting it properly according to APNS rules.
class StartedFollowing < Struct.new(:user_name, :user_id)
  def payload
    # ...
  end

  private

  def payload_template(user_name)
    {
      aps: {
        alert: "'#{user_name}' started following you",
      },
      path: "appnameios://users/#{user_id}",
    }
  end

  MAX_APS_BYTES = 256

  def payload_arg_max_size
    MAX_APS_BYTES - payload_without_args_size
  end

  def payload_without_args_size
    payload_template("").to_json.bytesize
  end
end
Truncating
Ok, we know how many bytes we have, so let’s truncate our international string. But remember that we are not truncating up to N chars, we are truncating up to N bytes! We can use String#byteslice for that.
It’s all nice and handy if we happen to truncate exactly between characters.
"łøü".bytes
# => [197, 130, 195, 184, 195, 188]
"łøü".byteslice(0, 4)
# => "łø"
But sometimes we won’t:
"łøü".byteslice(0, 3)
# => "ł\xC3"
We are left with one proper character and one stray byte, which is ugly.
I spent a long time figuring out how to properly fix this, and it seems that the right answer is String#scrub. For those of you who are stuck with an older Ruby version (before 2.1), there is a backport of it in the form of the string-scrub gem.
So if you ever need to truncate a user-provided UTF-8 string and support international characters, byteslice + scrub will do the job for you:
"łøü".byteslice(0, 3).scrub("")
# => "ł"
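The two calls can be wrapped in a small helper. This is only a sketch: `truncate_to_bytes` is a name invented here, and it assumes Ruby 2.1 or newer (or the string-scrub backport) so that String#scrub is available.

```ruby
# Hypothetical helper: cut a UTF-8 string to at most max_bytes bytes
# without leaving a partial character at the end.
def truncate_to_bytes(string, max_bytes)
  # byteslice may cut in the middle of a character;
  # scrub("") then drops the invalid trailing bytes.
  string.byteslice(0, max_bytes).scrub("")
end

truncate_to_bytes("łøü", 6)  # => "łøü"
truncate_to_bytes("łøü", 5)  # => "łø"
truncate_to_bytes("łøü", 1)  # => ""
truncate_to_bytes("abc", 10) # => "abc"
```

Note that scrub("") also removes invalid bytes that were already present in the input, and that the cut respects code points only: a base letter can still be separated from a combining mark that follows it.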
Full solution
require 'string-scrub' unless String.instance_methods.include?(:scrub)
require 'json'

class StartedFollowing < Struct.new(:user_name, :user_id)
  InvalidPayloadGenerated = Class.new(StandardError)
  # Raised when the fixed part of the payload alone exceeds the limit
  PayloadTooBigToGenerate = Class.new(StandardError)

  def payload
    raise PayloadTooBigToGenerate if payload_arg_max_size < 0
    payload_template(truncated_user_name).tap do |hash|
      # Sanity check: the final JSON must fit into the APNS limit
      size = hash.to_json.bytesize
      size <= MAX_APS_BYTES or raise(
        InvalidPayloadGenerated.new("Payload size was: #{size}")
      )
    end
  end

  private

  def payload_template(name)
    {
      aps: {
        alert: "'#{name}' started following you",
      },
      path: "appnameios://users/#{user_id}",
    }
  end

  MAX_APS_BYTES = 256

  def payload_arg_max_size
    MAX_APS_BYTES - payload_without_args_size
  end

  def payload_without_args_size
    payload_template("").to_json.bytesize
  end

  def truncated_user_name
    user_name.byteslice(0, payload_arg_max_size).scrub("")
  end
end
notif = StartedFollowing.new("łøü"*100, 12345)
notif.payload
# => {:aps=>{:alert=>"'łøüłøüłøüłøüłøüłøüłøüłøüłøüłø
# üłøüłøüłøüłøüłøüłøüłøüłøüłøüłøüłøüłøüłøüłøüłøüłøüł
# øüłøüłøüłø' started following you"}, :path=>"appnameios://users/12345"}
notif.payload.to_json.bytesize
# => 256
Yay! We used our payload to its full extent!
Troubles
I added the line size <= MAX_APS_BYTES or raise InvalidPayloadGenerated.new("Payload size was: #{size}") at the end just to make sure that everything is ok with my approach and to catch errors early (and I implemented tests as well). Lucky me!
In my case it turned out that my JSON encoder was using numeric escape sequences, so the way I calculated the size of my truncated string was wrong, because in JSON it turned out to be bigger:
puts "łøü".to_json
# => "łøü"
"łøü".to_json.bytesize
# => 8 # 6 bytes for string plus 2 bytes for ""
vs
puts "łøü".to_json
# => "\u0142\u00f8\u00fc"
"łøü".to_json.bytesize
# => 20
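The difference can be reproduced with Ruby's standard json library, which emits raw UTF-8 by default and numeric escapes when the `ascii_only` option is set. This is a sketch assuming json 2.x (which accepts a bare string as the top-level value); the encoder in the original setup may have been a different one.

```ruby
require 'json'

name = "łøü"

# Non-ASCII characters kept as raw UTF-8: 2 bytes each, plus 2 quotes
utf8_json = JSON.generate(name)

# Non-ASCII characters written as \uXXXX: 6 bytes each, plus 2 quotes
escaped_json = JSON.generate(name, ascii_only: true)

utf8_json.bytesize    # => 8
escaped_json.bytesize # => 20
```

Whichever encoder is used for sending is the one that should be used for measuring.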
So I extracted the code responsible for truncating one string into a class:
# Requires ActiveSupport (String#mb_chars) and a JSON encoder (String#to_json)
class TruncateStringWithMbChars
  def initialize(string_with_mb_chars, maxbytes)
    @string_with_mb_chars = string_with_mb_chars
    @maxbytes = maxbytes
  end

  def call
    string_with_mb_chars.mb_chars[0..last_char_id].to_s
  end

  private

  attr_reader :string_with_mb_chars, :maxbytes

  # Index of the last character that still fits into maxbytes
  def last_char_id
    string_with_mb_chars.
      each_char.
      map { |c| c.to_json.bytesize }.
      each_with_index.
      inject(maxbytes) do |bytesum, (bytes, i)|
        bytesum -= (bytes - 2) # minus 2 bytes for the surrounding quotes
        return i - 1 if bytesum < 0
        bytesum
      end
    string_with_mb_chars.size
  end
end
This algorithm basically iterates over every char, checks how many bytes it is going to take in our JSON payload, and stops when we have no more space for our text. I am not proud of this code. Do you know a better way to do it? What’s the right way to check how many bytes a char will take if encoded as a numeric escape sequence? I am sure there must be an easier way to do it.
Warning: It has a bug when maxbytes is not enough for even one character to be left. In that case last_char_id returns -1, and the range 0..-1 selects the whole string instead of an empty one.
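One possible alternative, offered as a sketch rather than as the definitive answer: measure each character with the standard json library's `ascii_only` option and keep characters while they fit. The names `escaped_json_bytesize` and `truncate_to_json_bytes` are invented for this example, json 2.x is assumed, and the measurement only matches an encoder that escapes the same way.

```ruby
require 'json'

# Bytes one character occupies inside a JSON string when every
# non-ASCII character is written as a \uXXXX escape.
def escaped_json_bytesize(char)
  JSON.generate(char, ascii_only: true).bytesize - 2 # drop the two quotes
end

# Keep leading characters while their escaped size fits into max_bytes.
def truncate_to_json_bytes(string, max_bytes)
  used = 0
  kept = string.each_char.take_while do |char|
    used += escaped_json_bytesize(char)
    used <= max_bytes
  end
  kept.join
end

truncate_to_json_bytes("łøü", 12)  # => "łø"
truncate_to_json_bytes("łøü", 5)   # => "" (not even one character fits)
truncate_to_json_bytes("a\"b", 3)  # => "a\"" (the quote costs 2 bytes)
```

Because take_while simply stops at the first character that does not fit, a budget too small for any character yields an empty string. Like the class above, this works on code points, so it does not keep combining sequences together.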
Multiple strings to substitute in notifications
The logic gets even more complicated if you want to embed multiple strings in your payload. A good example is a notification like ‘UserX’ & ‘UserY’ invite you to game ‘Game’. A naive implementation could give ⅓ of the bytes to each substituted string. But I wanted the algorithm to be smart and work well even when some names are long and some are short. My algorithm for truncating multiple strings so that together they use no more than N bytes looks like this:
class TruncateMultipleStrings
  def initialize(strings, maxjsonbytes)
    @strings = strings
    @maxjsonbytes = maxjsonbytes
  end

  def call
    # Remember the original order; keys are object ids of the strings
    hash = @strings.inject({}) do |memo, string|
      memo[string.object_id] = string; memo
    end
    maxjsonbytes = @maxjsonbytes
    # Process the shortest strings first so unused bytes flow to longer ones
    hash.
      values.
      sort_by { |s| string_json_bytesize(s) }.
      each_with_index do |string, index|
        maxjsonbytes_for_string = maxjsonbytes / (@strings.size - index)
        shortened = TruncateStringWithMbChars.new(
          string,
          maxjsonbytes_for_string
        ).call
        maxjsonbytes -= string_json_bytesize(shortened)
        hash[string.object_id] = shortened
      end
    hash.values
  end

  private

  def string_json_bytesize(string)
    string.to_json.bytesize - 2
  end
end
Be aware that it doesn’t favor any of the strings. If they are all very long, then each of them will be allowed to use the same number of bytes. If any of the strings is short, then the unused bytes are split equally among the other strings.
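The splitting rule can be looked at in isolation by working on sizes only. The sketch below is a simplified restatement, not the class above: `allocate_bytes` is a hypothetical name, and ties are broken by position to make the result deterministic.

```ruby
# Given the byte size each string would like and a shared budget,
# return how many bytes each string is allowed to use.
def allocate_bytes(sizes, budget)
  allocation = Array.new(sizes.size, 0)
  remaining = budget
  # Shortest first; the index is a tie-breaker for equal sizes.
  order = sizes.each_index.sort_by { |i| [sizes[i], i] }
  order.each_with_index do |i, position|
    # Equal share of what is left among the strings not yet handled
    share = remaining / (sizes.size - position)
    allocation[i] = [sizes[i], share].min
    remaining -= allocation[i]
  end
  allocation
end

allocate_bytes([5, 13, 150], 60)    # => [5, 13, 42]
allocate_bytes([150, 13, 150], 60)  # => [23, 13, 24]
allocate_bytes([150, 150, 150], 60) # => [20, 20, 20]
```

Handling the shortest strings first is what lets their unused share carry over to the longer ones.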
TruncateMultipleStrings.new(
["short", "medium medium", "long "*30], 60
).call
# => [
# "short",
# "medium medium",
# "long long long long long long long long lo"
# ]
TruncateMultipleStrings.new(
["long "*30, "medium medium", "long "*30], 60
).call
# => [
# "long long long long lon",
# "medium medium",
# "long long long long long"
# ]
TruncateMultipleStrings.new(
["long "*30, "long "*30, "long "*30], 60
).call
# => [
# "long long long long ",
# "long long long long ",
# "long long long long "
# ]
Here is an example of a class that could use it:
class GameInvited < Struct.new(:user1, :user2, :game_name, :game_id)
  InvalidPayloadGenerated = Class.new(StandardError)
  # Raised when the fixed part of the payload alone exceeds the limit
  PayloadTooBigToGenerate = Class.new(StandardError)

  def payload
    raise PayloadTooBigToGenerate if payload_arg_max_size < 0
    payload_template(*truncated_names).tap do |hash|
      size = hash.to_json.bytesize
      size <= MAX_APS_BYTES or raise(
        InvalidPayloadGenerated.new("Payload size was: #{size}")
      )
    end
  end

  private

  def payload_template(u1, u2, g)
    {
      aps: {
        alert: "#{u1} and #{u2} invite you to game #{g}",
      },
      path: "appnameios://games/#{game_id}",
    }
  end

  MAX_APS_BYTES = 256

  def payload_arg_max_size
    MAX_APS_BYTES - payload_without_args_size
  end

  def payload_without_args_size
    payload_template("", "", "").to_json.bytesize
  end

  def truncated_names
    TruncateMultipleStrings.new(
      [user1, user2, game_name],
      payload_arg_max_size
    ).call
  end
end
notif = GameInvited.new(
"User1 "*100,
"User2 "*100,
"Game "*100,
123457890123
)
notif.payload
# => {:aps=>{:alert=>"User1 User1 User1 User1 User1 User1
# User1 User1 User1 Use and User2 User2 User2 User2 User2
# User2 User2 User2 User2 Use invite you to game Game Game
# Game Game Game Game Game Game Game Game Game G"},
# :path=>"appnameios://games/123457890123"}
Urban Airship
Remember that if you are using Urban Airship, you should use even fewer than 256 bytes in total so that they can provide you with tracking.
Quote from their documentation:
The maximum message size is 256 bytes. This includes the alert, badge, sound, and any extra key/value pairs in the notification section of the payload. We also recommend leaving as much extra space as possible if you are using our reporting tools, as a portion will be used to help with response tracking if it is available.
Unfortunately I couldn’t find out exactly how many bytes they need for this functionality to work properly. If any of you have the knowledge, please let me know.
Storing notification templates on the phone
If your messages are particularly long (at least in some locales), you can save some bytes by storing the template in the app and sending only the data. As Apple’s documentation puts it:
You can display localized alert messages in two ways. The server originating the notification can localize the text; to do this, it must discover the current language preference selected for the device (see “Passing the Provider the Current Language Preference (Remote Notifications)”). Or the client application can store in its bundle the alert-message strings translated for each localization it supports. The provider specifies the loc-key and loc-args properties in the aps dictionary of the notification payload. When the device receives the notification (assuming the application isn’t running), it uses these aps-dictionary properties to find and format the string localized for the current language, which it then displays to the user.
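A payload built this way could look roughly like the sketch below. `loc-key` and `loc-args` are the aps-dictionary properties named in the quote; the key `FOLLOW_NOTIFICATION` and the method name are made up for illustration, and the app bundle would need a matching localized string such as "'%@' started following you".

```ruby
require 'json'

# Sends only the data; the message text itself lives in the app bundle.
# "FOLLOW_NOTIFICATION" is a hypothetical key for this example.
def localized_payload(user_name, user_id)
  {
    aps: {
      alert: {
        "loc-key"  => "FOLLOW_NOTIFICATION",
        "loc-args" => [user_name],
      },
    },
    path: "appnameios://users/#{user_id}",
  }
end

json = localized_payload("User X", 123).to_json
json.bytesize <= 256 # the same 256-byte limit still applies
```

The byte budget for the user name can be computed exactly as before, by measuring the payload with an empty argument. Whether this saves bytes depends on how long the message is compared to the key.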