For mobile apps, is it faster to use S3 or pusher.com?

I need to upload and download images almost like a chat app, meaning as close to real time as possible. I am torn between two models: upload the image to S3 and put its location in a message to pusher.com, or break the image into many parts, send them via pusher.com, and let the receiver put them back together.

I am open to either ios or android as a test for this scenario.

Let's assume the image is 100kb under both the S3 and pusher scenarios.

Forgot to say, the pusher.com model is nice because it's a single system to interface and interact with, so our preference is in this direction, but there is some question about the overhead of message chunking. That said, the chunking we have here is different from a post to S3 followed by a post to pusher.com.
Qdev 9 years ago
Notes for some replies: (1) I think the messages would be guaranteed to be in order. If not, they would for sure have date/time stamps to imply order. (2) Connecting to S3 (we need to assume hitting a server for an auth token), passing auth, uploading the image, and then sending a pusher message to let the other systems know the S3 upload is done is not an insignificant amount of work. This is the precise reason we are debating and looking for an experiment around this. Our feeling is that the easy solution is S3 + pusher, but there are strong arguments suggesting that a stream of socket communication would be a lot faster on mobile.
Qdev 9 years ago
@Quotient so you are looking for code that allows you to test both scenarios to see which one is quicker?
alex 9 years ago
Yes you got it.
Qdev 9 years ago
awarded to alex


5 Solutions


Chunking the message could get a bit messy, what if the chunks come out of order for instance? With S3 you should have no issues with delay, just post the image data, send the push with the URL and it will be available immediately for the client.


Using s3 would be better for two reasons, reliability and speed. Using s3 would only require 1 request while using only pusher would require the client to wait for (at least) 10 messages because each pusher message is limited to 10kb.

This becomes a problem in the event of a connection interruption. If you're using s3, the client can easily redownload the image while if you're using solely pusher, getting the parts for the image would be harder. Also, making one HTTP request is much quicker than waiting for 10 messages over a push channel. Thus, using s3 would be better because it is both more reliable and faster than relying solely on pusher.


Pusher is fine for now, but if you plan to grow the image size from 100kb to 1000kb or more, then Amazon S3 is preferable.


You're probably expecting to get a native iOS/Android benchmark, but unfortunately I can't help you with that. I also don't have an AWS account, so I decided to fix the bugs I found in the code @alex provided as well as benchmarking an asynchronous push implementation.

Splitting the Image into 10KiB Chunks

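// $path (image file) and $pusher (a Pusher client instance) are assumed to be set up earlier.
// Base64-encode the raw image bytes so they can travel inside a JSON payload.
$data = base64_encode(file_get_contents($path));

foreach (str_split($data, 10 * 1024) as $chunk) // 10 * 1024 bytes = 10240 bytes = 10KiB
{
    // $already_encoded = true, otherwise we would have to split by (10 * 1024 - 2) bytes
    $pusher->trigger('test_channel', 'my_event', $chunk, null, false, true);
}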
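
The loop above only covers the sending side. As a rough sketch of the receiving side, and of one way to deal with chunks arriving out of order, each chunk could carry a sequence number and the total chunk count. The envelope fields (`seq`, `total`, `data`), the function names, and the 8 KiB chunk size (chosen to leave headroom for the envelope under the 10 KiB limit) are all assumptions made for illustration; none of them come from the Pusher API.

```php
<?php
// Hypothetical envelope: every chunk records its position and the total
// number of chunks, so the receiver can reorder and detect missing pieces.
function make_chunks(string $binary, int $chunkSize = 8 * 1024): array
{
    $parts = str_split(base64_encode($binary), $chunkSize);
    $total = count($parts);
    $messages = [];
    foreach ($parts as $index => $part) {
        $messages[] = ['seq' => $index, 'total' => $total, 'data' => $part];
    }
    return $messages;
}

// Returns the original binary data, or null while chunks are still missing
// (or if the joined payload is not valid base64).
function reassemble(array $messages): ?string
{
    if ($messages === []) {
        return null;
    }
    $total = $messages[0]['total'];
    $bySeq = [];
    foreach ($messages as $message) {
        $bySeq[$message['seq']] = $message['data']; // duplicates overwrite
    }
    if (count($bySeq) < $total) {
        return null;
    }
    ksort($bySeq); // restore sending order regardless of arrival order
    $decoded = base64_decode(implode('', $bySeq), true);
    return $decoded === false ? null : $decoded;
}
```

The chunks are joined before decoding, so it does not matter that an individual chunk may end in the middle of a base64 group.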

Target Image (73764 bytes / 72 KiB):


Encoding

Since the Pusher REST API only deals with JSON requests, the image binary data needs to be encoded (this might not be the case for the WebSocket interface, I haven't tested that). Base64 is the usual compact way of encoding binary data for JSON, but it inflates the data by about 33% (every 3 bytes become 4 characters), meaning that our post data is now precisely 98352 bytes.
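
The 98352 figure can be checked with a little arithmetic: padded base64 emits 4 characters for every group of up to 3 input bytes. A minimal sketch, where `base64_encoded_length` is a made-up helper name rather than a built-in:

```php
<?php
// Length of the padded base64 encoding of $rawBytes bytes:
// 4 output characters per (started) group of 3 input bytes.
function base64_encoded_length(int $rawBytes): int
{
    return 4 * intdiv($rawBytes + 2, 3);
}

$imageBytes = 73764; // size of the target image in this answer
$encodedBytes = base64_encoded_length($imageBytes);
```

For the 73764-byte image this gives 98352, the same number `strlen(base64_encode(...))` reports for data of that size.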

Benchmark

Like I said above, I benchmarked the bug-fixed normal version (aka pusher_upload) and I've also modified the Pusher library to support concurrent requests with the curl_multi interface (aka pusher_upload_multi). I've made sure that all the requests were actually being made and were returning successful status codes / responses. The requests also arrived in the same order in which they were sent (at least according to the Pusher Debug Console), which I confirmed by checking the first / last bytes of each message sent / received. Here's the result:

[14] => Array
    (
        [pusher_upload_multi] => Array
            (
                [absolute] => 2.1296268020357405
                [relative] => 1.0000000000000000
                [xhprof] => Array
                    (
                        [cpu] => 56 000
                        [time] => 2 226 599
                        [memory] => 25 348
                    )
            )

        [pusher_upload] => Array
            (
                [absolute] => 6.5347417933600287
                [relative] => 3.0684915249532811
                [xhprof] => Array
                    (
                        [cpu] => 64 000
                        [time] => 6 033 424
                        [memory] => 18 452
                    )
            )
    )

[arguments] => a:1:{i:0;s:15:"./macchiato.jpg";}

Basically, this says that each method ran 14 times (cycling each one, so it's fairer); that the pusher_upload_multi method took on average 2.13 seconds per call while the normal pusher_upload method took 6.53 seconds (about 3.07 times as long, i.e. roughly 207% slower). Note that the pusher_upload_multi implementation was doing 10 concurrent requests.
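
For readers who want to reproduce this kind of comparison, here is a minimal sketch of a harness that cycles through the methods on every round and reports an average (`absolute`) plus a ratio to the fastest method (`relative`). It is not the code used for the numbers above; the `benchmark` function and its array layout are assumptions modelled on the output format shown.

```php
<?php
// $methods maps a label to a callable; each round runs every method once,
// so slow drift (network, CPU load) is spread across all methods.
function benchmark(array $methods, int $rounds): array
{
    $totals = array_fill_keys(array_keys($methods), 0);
    for ($round = 0; $round < $rounds; $round++) {
        foreach ($methods as $name => $method) {
            $start = hrtime(true);               // monotonic clock, nanoseconds
            $method();
            $totals[$name] += hrtime(true) - $start;
        }
    }

    $results = [];
    foreach ($totals as $name => $total) {
        $results[$name] = ['absolute' => $total / $rounds / 1e9]; // seconds per call
    }

    $fastest = min(array_column($results, 'absolute'));
    foreach ($results as $name => $result) {
        // 1.0 for the fastest method, e.g. 3.07 for one taking 3.07x as long.
        $results[$name]['relative'] = $fastest > 0 ? $result['absolute'] / $fastest : 1.0;
    }
    return $results;
}
```

A call could look like `benchmark(['pusher_upload' => $a, 'pusher_upload_multi' => $b], 14)`, where `$a` and `$b` are closures wrapping the two upload routines.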

I can post the code for the benchmark, but I don't think that's interesting (see note below).

REST vs WebSockets

I assume you weren't planning to use the REST API to chunk and upload images. Either way, keep in mind that the WebSocket version would most probably be faster.

Note on Pusher Limitations

While reading the docs I noticed two important drawbacks:

  1. each pusher message has a limit of 10 KiB, as @alex referenced, and
  2. you can only trigger 10 events per second, more than that and Pusher drops the event

This means that images >= 100 KiB cannot be reliably uploaded or downloaded in <= 1 sec, and thus they need to be throttled. And if you consider the base64 encoding inflation (like I said, not sure if it also applies to WebSockets), you're effectively bringing that size limit down to about 75 KiB (100 KiB / 1.33). I don't know about your needs, but that sounds like a deal-breaker.
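
To put numbers on the two limits, here is a small sketch that estimates how many messages an image needs after base64 encoding and the resulting lower bound on transfer time. The constants simply restate the limits quoted in this answer (10 KiB per message, 10 events per second); the constant and function names are made up for the example, and the estimate ignores latency and any envelope overhead.

```php
<?php
const PUSHER_MAX_MESSAGE_BYTES = 10 * 1024; // per-message limit quoted above
const PUSHER_MAX_EVENTS_PER_SECOND = 10;    // rate limit quoted above

// Number of messages needed for an image of $rawBytes bytes once it has
// been base64-encoded (4 characters per started group of 3 bytes).
function pusher_chunk_count(int $rawBytes): int
{
    $encoded = 4 * intdiv($rawBytes + 2, 3);
    return intdiv($encoded + PUSHER_MAX_MESSAGE_BYTES - 1, PUSHER_MAX_MESSAGE_BYTES);
}

// Rough lower bound, in seconds, for pushing that many messages
// without exceeding the event rate limit.
function pusher_min_seconds(int $rawBytes): float
{
    return pusher_chunk_count($rawBytes) / PUSHER_MAX_EVENTS_PER_SECOND;
}
```

Under these assumptions a 100 KiB image needs 14 messages (about 1.4 seconds at best), and a 500 KiB image needs 67 messages (about 6.7 seconds), for the upload alone.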

(Considering Just Speed) vs AWS S3

Judging by the values @alex posted, and considering that he only uploaded (to Pusher) 12800 bytes out of the 136260 bytes that it would actually take to upload the image in full, I suppose the Pusher + S3 test would outperform the Pusher-only one (not by a lot, but still).

@alixaxel doesn't str_split() count by the number of characters? Thus, shouldn't 10 * 1024 be 10,240 characters = 81,920 bytes = 80 kb?
alex 9 years ago
@alex: How come 1 character = 8 bytes? I think you're confusing it with: 1 character = 1 byte = 8 bits. And str_split() splits by bytes (all the "common" / ASCII characters take just 1 byte, however).
alixaxel 9 years ago
@alixaxel Whoa! Thanks for catching that.
alex 9 years ago
@alex: No problem! ;)
alixaxel 9 years ago
Solid catch on the rate limiting. That is for sure the deal breaker on this method. Gosh, I really wanted it to be pusher as the winner.
Qdev 9 years ago
@Quotient: So did I, too bad. =\ Thanks for the tip!
alixaxel 9 years ago
Winning solution

TL;DR Difference is still negligible, but S3 is slightly quicker

Overview

Okay, so here's how I approached the problem. I started off creating a PHP script which would act as the "server." It accepts two GET requests, ?action=s3 and ?action=po. The first request sends an image through Amazon S3 and then uses pusher to send the url of the image to the client. The second request sends a part of the file (base64 encoded) through pusher to the client.

Results

UPDATE
Turns out I was doing my math wrong when splitting the files up (Thanks @alixaxel), also I was forgetting to factor in the 4/3 increase in size when using base64. I have updated the code and re-ran the tests.

Total (Pusher Only): 11.944999933

Total (Pusher + S3): 11.023000240

END UPDATE

Total (Pusher Only): 4.058000564

Total (Pusher + S3): 4.794000149

Downloads

UPDATE Here are the files for the updated test.

Here is a download of the files used in the test.

500kb Test

Result: S3 wins by far.

Using a 500kb image now requires 67 pusher requests if you're relying solely on pusher. I only did three trials here as the winner was easy to determine. Here are the files which were used.

Total (Pusher Only): 23.554999828

Total (Pusher + S3): 4.348999977

I was thinking the exact same thing, but I don't have an AWS account. Anyhow, I took a quick look at your code and a couple of things caught my eye, perhaps you would like to review these:
alixaxel 9 years ago
1) the way you're timing the methods is very volatile IMO - it has unnecessary overhead that may contribute in favor of one or another. Ideally, I think you should use/return the request response times directly from cURL or similar.
alixaxel 9 years ago
2) in pusher, you're converting the image data to base64 which makes it 33% larger - is this really necessary? You also mentioned the maximum pusher chunk was 10kb in size, I'm probably missing something, but I don't understand the 1280 there: 10KiB = 10240 bytes -> 10240 / 10 = 1024, no? Depending on the image size, you may also be performing unnecessary requests (always 10) that slow down the pusher method.
alixaxel 9 years ago
3) on a follow-up note, 1280 bytes is 1.25KiB, far from the limit of 10KiB pusher accepts.
alixaxel 9 years ago
4) although that's not part of the Pusher class, it would be possible to make the Pusher::trigger() method use the curl_multi functions to get a considerable speed improvement. I'm just saying this because I don't know the iOS/Android Pusher API, and they may contain methods for performing concurrent pushes.
alixaxel 9 years ago
5) you're only sending ~12.5% of the image to Pusher (102451 bytes (image size) / 10 = 10245.1; $messsage_len = 1280; 1280 * 10 (requests) = 12800 (bytes))?
alixaxel 9 years ago
Alex, thanks for the trial. This is exactly what I was looking for. So if I'm understanding this right, pusher is the faster method at this file size? Can we make the updates mentioned around 10% of the file and base64? It seems like this should also speed things up a bit on the pusher side.
Qdev 9 years ago
@alixaxel 1) Yea, I think it's not perfect and has some parts which could be worked out better. 2 and 3) The 1280 is referring to the number of characters in the data sent, thus each message is still 10kb (1280*8=10,240). It doesn't matter if I'm sending it as base64, the data that is going through is still 100 kb. 4) Not sure about that either, thus I took the safe way and made it individual requests. 5) Mentioned before.
alex 9 years ago
@Quotient: After reading the code, base64 really is necessary if you want to work with the REST API. But you can shave an extra two bytes per chunk with $already_encoded = true! =P
alixaxel 9 years ago
@alex: Ah! That's where the 1280 comes from! You're confusing bits with bytes.
alixaxel 9 years ago
Alex, for the sake of knowing how this scales, can we see 500kb? Lastly, are the pusher chunks sent in parallel or serial? Thx guys!
Qdev 9 years ago
@Quotient: The upload requests @alex did are in serial, but for a 500KB image it would take, at least, 5 seconds to upload and 5 other seconds to download (even with parallel requests). Pusher doesn't seem tailored for this.
alixaxel 9 years ago
@alex: Damn, what a smackdown!
alixaxel 9 years ago
@alixaxel Haha, I was expecting them to be close, but wow, the 500kb result was surprising
alex 9 years ago