PHP - Setting the correct Cache Headers (client side caching)

I have a PHP app which creates images on the fly. Once an image has been created for the first time, it is stored in a cache folder. When serving it from the cache folder I need to output the correct cache headers for proper client-side caching.

I have no idea what the right way of doing this is. Requirements:

  • If the image was recently modified, it should replace the copy in the user's cache, and the image should be requested from our server again.
  • If the image was not modified since the last request for the same image, we should send the right headers so the image is fetched from the client-side cache without being requested from our server.

Please investigate:

  • Should we use the ETag and Expires headers?
  • Should we use the 304 Not Modified response?
  • Should we use Cache-Control: public or Cache-Control: private?

This is all so confusing. I hope a developer experienced in this can solve it for me in five minutes, so I don't have to do all the research myself. Thanks!

This is my current (incomplete) function:

/**
 * Passes an image through from the cache folder to the client.
 * @param string $fileName
 * @param string $fileExtension
 */
public static function renderImageFromCacheFolder($fileName, $fileExtension){
    $filePath = self::getImagesCacheFolderPath() . $fileName . "." . $fileExtension;

    // Modify last access time by using touch, so when we run cleanCache we can delete all images not accessed in the last 30 days
    touch($filePath);

    header("Content-Type: image/$fileExtension");
    header('Content-Length: ' . filesize($filePath));

    // Client side caching
    header('Last-Modified: ' . gmdate('r', filemtime($filePath)));

    echo file_get_contents($filePath);
}

Bounty: Please fix my function and correct headers for client side caching using best practices.

awarded to alv-c
Tags
PHP


2 Solutions

Winning solution

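I think changing your touch call to touch($filePath, filemtime($filePath), time()) should do it. The second argument keeps the file's modification time as it was and the third sets the access time to now, so Last-Modified stays stable while cleanCache can still see when the image was last served.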
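As a rough illustration of that touch call, here is a minimal sketch. The helper name markCacheFileAccessed is made up for this example and is not part of the original code; it only assumes the cached file already exists.

```php
<?php
// Hypothetical helper (name invented for illustration): refresh only the
// access time of a cached file and leave its modification time untouched.
function markCacheFileAccessed(string $filePath): bool
{
    // Drop PHP's cached stat data so filemtime() reads the current value.
    clearstatcache(true, $filePath);

    $mtime = filemtime($filePath);
    if ($mtime === false) {
        return false; // file missing or unreadable
    }

    // touch(filename, mtime, atime): keep the old mtime, set atime to "now".
    return touch($filePath, $mtime, time());
}
```

Because the modification time is preserved, a Last-Modified header derived from filemtime() should not change just because the image was served.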

I explained how it works at your other bounty.

Edit:

I edited your method. Code is commented so I think it's pretty easy to understand how it works

Edit 2:

Here is what I think is the final version

Edit 3:

Here is your code modified

Edit 4:

final code

What about ETag, public cache and all the other headers? Why are they necessary or not? Could you please post a complete solution? Thank you very much!
georgefountain 18 days ago
Can anyone clarify what to do with ETag and the other headers? Thanks
georgefountain 16 days ago
Edited solution!.
alv-c 16 days ago
Hi Alv! Thanks so much for editing the solution. It is all clear except one line: if (@strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE'])==$lastModified || $etagHeader == $etagFile). Are you sure about this line? What if the cache file was modified (filenames don't change)? The ETag would be the same. Did you mean to put the AND (&&) operator?
georgefountain 16 days ago
Shouldn't the code be like this? if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE'])) { if (@strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) == $lastModified) { $notModified = true; } else { $notModified = false; } } elseif ($etagHeader == $etagFile) { $notModified = true; } if ($notModified) { header("HTTP/1.1 304 Not Modified"); exit; } If so, please modify the file. Or explain. Thank you so much!!!
georgefountain 16 days ago
Edited again. Sorry for that error!. Now it uses the defined variable $ifModifiedSince, that was set before. And also left a comment explaining the use of || instead of &&
alv-c 16 days ago
Thanks Alv for the quick reply! I just edited your last answer, which is what I was trying to say. It is easier to read on the pastebin. Please kindly check it, and if it is OK, I can award the bounty to you :) http://pastebin.com/z4pdkCvx Thanks again!
georgefountain 16 days ago
The concept is: only check the ETag if $ifModifiedSince is not set. Since the file path never changes, even if the file is modified the md5 will be the same. That is what I was trying to say, although I am not sure if my code logic is correct.
georgefountain 16 days ago
Edited!. Changed your code to use $ifModifiedSince instead of $_SERVER['HTTP_IF_MODIFIED_SINCE'] (and btw, you wrote $SERVER['HTTPIFMODIFIEDSINCE']).
alv-c 16 days ago
Thanks! I got an idea!!! What if we make the Etag hash include the modification date? Check this out: http://pastebin.com/EJ5z0CdV . Now I believe the code is perfectly safe and an ETAG will only be the same if it is referring to the same filepath and modification time.
georgefountain 16 days ago
What do you think?
georgefountain 16 days ago
Hmm. I think you should keep both. Here's a good explanation of why to use both: http://stackoverflow.com/a/500103
alv-c 16 days ago
Oh but I didn't remove ETAG. I just changed the way the ETAG was generated and included last modification date. This was my only change: $etagFile = md5_file($filePath . lastModified); ... what do you think?
georgefountain 16 days ago
Oh. Then I think it's perfectly fine
alv-c 16 days ago
Great! So this is our final code: http://pastebin.com/EJ5z0CdV ?
georgefountain 16 days ago
Edited solution with final code. In your code there was an error at line 19: the $ sign was missing on the lastModified variable, and you concatenated the last modification date to the path of the image, when you needed to concatenate the last modification time to the result of calling md5_file on the image path. As I said, the code is now in the solution as final code.
alv-c 16 days ago
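The final code itself is not reproduced on this page, so the following is only a sketch of the ETag idea described in the comment above: md5_file() on the image path with the last modification time appended. The function name buildEtag, the "-" separator and the surrounding double quotes (HTTP entity tags are quoted strings) are assumptions added here, not something taken from the thread. Note that md5_file() hashes the file's contents rather than its path, so appending the modification time mainly matters when a file is rewritten with identical bytes.

```php
<?php
// Hypothetical helper (name invented for illustration): build an ETag that
// changes when either the file contents or the modification time change.
function buildEtag(string $filePath): string
{
    // Make sure filemtime() does not return a stale, cached value.
    clearstatcache(true, $filePath);
    $lastModified = filemtime($filePath);

    // Content hash first, modification time appended, as described in the
    // comment thread. Separator and quotes are assumptions of this sketch.
    return '"' . md5_file($filePath) . '-' . $lastModified . '"';
}
```

The value would then be sent with header('ETag: ' . buildEtag($filePath)) and compared against the If-None-Match request header on later requests.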
Thank you Alv! :)
georgefountain 13 days ago
What do you think about changing public cache to private cache? Thanks
georgefountain 13 days ago
The client will cache the response with either public or private caching. The difference is that private does not allow shared caches such as proxies to store the response. See more: http://stackoverflow.com/a/3492459
alv-c 12 days ago
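To make the public/private distinction concrete, here is a small sketch that builds a Cache-Control value. The function name cacheControlValue, the $perUser flag and the max-age lifetime are all assumptions for illustration; the thread does not specify a lifetime.

```php
<?php
// Hypothetical helper (name and parameters invented for illustration).
// "private": only the end user's browser may store the response.
// "public":  shared caches such as proxies may store it as well.
function cacheControlValue(bool $perUser, int $maxAgeSeconds): string
{
    $scope = $perUser ? 'private' : 'public';

    // Negative lifetimes make no sense, so clamp them to zero.
    return sprintf('%s, max-age=%d', $scope, max(0, $maxAgeSeconds));
}
```

It could be used as header('Cache-Control: ' . cacheControlValue(true, 86400)); for images that depend on user settings, with the one-day lifetime being only an example value.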
Thanks! Another question: why are you ignoring warnings on strtotime()? What are the expected inputs that could cause a warning, and why did you add the @?
georgefountain 11 days ago
Let me know about this when you can. Thanks :)
georgefountain 11 days ago
In this case, the input comes from a reliable source, so there should be no error. But errors can come from a bad time parameter or a php.ini misconfiguration.
alv-c 11 days ago
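One possible way to avoid the @ operator altogether is sketched below; this is an alternative, not the code from the solution. It relies on two facts: reading a missing $_SERVER key is what triggers a notice or warning, and strtotime() returns false when it cannot parse its input. The function name parseIfModifiedSince is made up for this example.

```php
<?php
// Hypothetical helper (name invented for illustration): read the
// If-Modified-Since request header without needing the @ operator.
function parseIfModifiedSince(array $server): ?int
{
    // A missing header is normal on a first request: report "no value".
    if (!isset($server['HTTP_IF_MODIFIED_SINCE'])
        || !is_string($server['HTTP_IF_MODIFIED_SINCE'])) {
        return null;
    }

    // strtotime() returns false for input it cannot parse.
    $timestamp = strtotime($server['HTTP_IF_MODIFIED_SINCE']);

    return $timestamp === false ? null : $timestamp;
}
```

In the render function it would be called as parseIfModifiedSince($_SERVER), and a null result simply means the request should be treated as unconditional.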

Caching is simple; I think this isn't a question that can be solved with code, you need to get the ideas right in your head, then you will be able to write the code yourself.

Basically whenever you want to think about caching stuff; think pull, not push.

Let's take for example the caching of a CSS stylesheet:

<link rel="stylesheet" href="//site.com/css/base.css?v1" />

Basically, within your public_html directory you would have a folder named css containing a file named base.css; the "?v1" part is a query string, not part of the file name on disk.

Want to update your stylesheet? Publish the updated CSS under a new URL, "base.css?v2", with a unique ETag and a very long expiry date; then update your (dynamically generated) HTML (which isn't cached, or at least has a very short expiry date) to point to the new URL.
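The versioned-URL idea can be sketched in a few lines. The helper name versionedUrl is an assumption; the "?v1" format follows the example above, and using something like the file's modification time as the version number would be one way to automate it, though that is not stated in the answer.

```php
<?php
// Hypothetical helper (name invented for illustration): append a version
// marker in the "?v1" style so a changed resource gets a new URL.
function versionedUrl(string $url, int $version): string
{
    // Use "&" when the URL already carries a query string.
    $separator = (strpos($url, '?') === false) ? '?' : '&';

    return $url . $separator . 'v' . $version;
}
```

For example, versionedUrl('//site.com/css/base.css', 2) gives '//site.com/css/base.css?v2', which a browser cache treats as a different object from the ?v1 URL.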

You generally only want to send a 304 Not Modified response in reply to a browser request containing If-Modified-Since or If-None-Match.
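A minimal sketch of that decision is shown below. The function name isNotModified is made up, and two simplifications are assumptions of this sketch: If-None-Match is compared as a single exact value (lists and weak validators are not handled), and If-None-Match is checked first, with If-Modified-Since used only when no ETag was sent.

```php
<?php
// Hypothetical helper (name invented for illustration): decide whether a
// conditional request can be answered with 304 Not Modified.
function isNotModified(array $server, int $lastModified, string $etag): bool
{
    // Prefer the ETag comparison when the client sent If-None-Match.
    if (isset($server['HTTP_IF_NONE_MATCH'])) {
        return trim($server['HTTP_IF_NONE_MATCH']) === $etag;
    }

    // Otherwise fall back to the date: the client's copy is still good if
    // the file has not changed after the date the client reports.
    if (isset($server['HTTP_IF_MODIFIED_SINCE'])) {
        $since = strtotime($server['HTTP_IF_MODIFIED_SINCE']);
        return $since !== false && $since >= $lastModified;
    }

    // No conditional headers at all: send the full response.
    return false;
}
```

A caller would typically do if (isNotModified($_SERVER, $lastModified, $etag)) { http_response_code(304); exit; } before sending the image body.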

You generally want to use "Cache-Control: public" for non-sensitive resources (stuff that's shared across all users). For per-user items you want to use "Cache-Control: private".

I hope this clears things up for you. Never try to "push" changes to the client; just use html as a pointer towards the new updated file.

edit; if you accept this answer, then i will start working on making a demo of these best practices in action which i will post in return for a 10$ bounty

edit2;
Looking at the earlier answer; it's garbage.
Using php to do cache busting (invalidating old versions of files) is amateur hour.
Version-based cache busting using html as explained above is industry standard.
Web servers and most application servers simply ignore query string parameters that they're not interested in. But by default, web browser caches have to assume that any difference in the query string will influence the result, so they treat every change in the URL as a distinct object.

edit3;
Version changing filenames can be part of your build process, or can be made obsolete by the clever .htaccess trick at the bottom of this page: "In this way a request for foo.123.css is processed by the server as foo.css".

http://stackoverflow.com/a/23604412
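The linked answer does this with a server rewrite rule, which is not reproduced here. As a rough PHP equivalent of the same mapping, a sketch could look like the following; the function name stripVersionSegment and the list of file extensions are assumptions made for this example.

```php
<?php
// Hypothetical helper (name and extension list invented for illustration):
// map a versioned name such as "foo.123.css" back to the real "foo.css".
function stripVersionSegment(string $fileName): string
{
    // Remove a purely numeric segment that sits right before the extension.
    $result = preg_replace(
        '/^(.+)\.\d+\.(css|js|png|jpe?g|gif)$/',
        '$1.$2',
        $fileName
    );

    // preg_replace() returns null on error; fall back to the input then.
    return $result ?? $fileName;
}
```

Names without a numeric segment pass through unchanged, so the same lookup can serve both versioned and unversioned requests.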

edit4;

[email]
<link rel="stylesheet" type="text/css" href="http://yoursite.com/base.css">

<img class="imageB" src="http://yoursite.com/image.php" width="50px" height="60px" />
<div class="imageA"></div>

<br/>
<br/>
<br/>
<br/>
l33th@x0r

<h1>Oh hai</h1>
[/email]

;

[base.css]
.imageA {
    background-image: url("http://yoursite.com/image.php?v1");

    width:50px;
    height:60px; 
    top:0px;
    left:0px;
    position:absolute;
} 

.imageB {
    top:0px;
    left:0px;
    position:absolute;
} 

body {
    background-color: lightblue;
}

h1 {
    color: navy;
    margin-left: 20px;
}
[/base.css]

;

[image.php] 
<?php

//base64 encoded binary of png file containing the letter A in white on black BG
$A = 'iVBORw0KGgoAAAANSUhEUgAAADIAAAA8CAIAAACrV36WAAACV0lEQVRoge2XIejqQBjAZUGGjL/NICIGEcMYRqNBxGAQ08KQBRGTmMxmg4iI0WAQ4zAsGETGMCyIGE0LQ0REhhwGGdsLB8dQ33P6Nn087pe2u++7++2+ufN8PgwGg8FgMBjMP4uiKJaN+Xz+bSOfL5lMWnfE43FXBifezuQ47r6xWq3+hYwbqKoKV2g6na7Xa3h9OBz8fv/XnDKZDCpcqVTKZrPolmXZr2kNh0MocTwe4fLIsgxbFovFd5xIktR1HUoMBgPYWCwW0YIlEokvaLEsiwzS6TRqR29bp9P5gpYoinD67XZrb2+1WjeV/RyhUOh6vcLp2+22vSsWixmGAbsefj48pNFooApms9mbXkmSYJckSR/VWq1WcGIAwH2lKpUKkk4mkx9yomkazSqK4n3Az8/P5XKBAd1u90Na7XYbadXr9Ycxk8kEBpxOJ5IkP6GladrTGhUKBRRTLpc9d7LvMKqq/i6MIIj9fg/DZFn2XGs0GlmvQ9O0h06BQOB8Pr+h1ev1PNTiOO4NJ8vrF382m72nZVkWz/OeOIXDYbSrGIYRiUSepkSjUZSyXC490Wo2m+jRnZ8j5vM5ymIYxn2tzWbzRkV4nkdZ/X7fZadUKoVGBwBQFOUwkaIoAABM1HU9EAg4n/T5ycf+pRYEAQDgcGgAgCAI8DoYDLr5H58giN1uh1Yrl8u9lJ7P51GuoiiuadnH1TSNIF47V948VSqVcpr45257BcfjsWmaL2mZpjkej9FtrVZ7Kf0x9nfWend3YxgGjXA+n53/YjAYDAaDwWD+D34BA1bn3VhhiaEAAAAASUVORK5CYII8YnIgLz4KPGI+V2FybmluZzwvYj46ICBpbWFnZWRlc3Ryb3koKSBleHBlY3RzIHBhcmFtZXRlciAxIHRvIGJlIHJlc291cmNlLCBzdHJpbmcgZ2l2ZW4gaW4gPGI+L2hvbWUvaGJlMDAyMzEvZG9tYWlucy9kYWtsYW5kLmJlL3B1YmxpY19odG1sL2Rldi9pbmRleC5waHA8L2I+IG9uIGxpbmUgPGI+MzQ8L2I+PGJyIC8+Cg==';

//base64 encoded binary of png file containing the letter B in white on black BG
$B = 'iVBORw0KGgoAAAANSUhEUgAAADIAAAA8CAIAAACrV36WAAAB10lEQVRoge2WMYvyMBjHc6GUDk7FIYM4iIOzFAdxdHJwEkc/gJ9ARMRJHMSP4FA6+RE6OImDg5OIQxEnkY6llJAWcsMdQe59OUzvooV7flOg/6f50TxNghAAAAAAAACgkrcHc5zz7wNxHEdR5Pv+6XTabDar1epyufzU7hEtKRhj0+nUMIxsaX2w3W51Xc+cFufctu0naf37FGNsGAYhpFarDQaD6/V6n280Gq/R+gIhxPO8Z3wwKS2EULfbFXmFv6SsVj6fF3lKqex0WN7wIRhjYhyGoWy5Kq37Nt/tdopmkW754/Eo8p1O52VamqaZpmlZ1mQyud1uIrxer1U5obTb6X6/N00zQ1qU0vl8nvpMVNXyGONKpWJZlqL3f5JuETnni8XiSVr/DWCMc7lcuVxut9uO4zDGRH42m71M6wv1ej0Igo98kiTVajUTWgih4XAoSpbLZVa0SqWSKPE8Lytauq6LkiiKpKZTtUEghAqFghjLntYKtXq9nhirunLJLmKz2aSUipLxePwyLU3TCCGtVsu27SRJRJ5SWiwWlWvJMhqNlDj9RMtxHIyVdXAKId/3+/1+uum03/KO4zgMwyAIzufz4XBwXdd13fsbPQAAAAAAwN/kHW2Ci4FDVaDXAAAAAElFTkSuQmCC';


//Check if we are cache busting using version strings
$explode = explode("?v",$_SERVER["REQUEST_URI"]);


//if version string available show letter B
if (isset($explode[1])) 
    $image = imagecreatefromstring(base64_decode($B));
//if version string unavailable show letter A
else
    $image = imagecreatefromstring(base64_decode($A));


header('Content-Type: image/png');
imagepng($image);
imagedestroy($image);

?>
[/image.php]

So yeah, I made a small edit to demonstrate my point. You can test the code most easily by signing up at https://litmus.com/coupon/twittertrial (this link lets you get screenshots of your HTML emails in lots of different email clients; it gives a 7 day trial without requiring any credit card details) and then uploading base.css and image.php to your website; double check all paths. You will see a significant number of widely used platforms display B, while quite a lot also display A.

You could make it so that you serve a transparent 1px PNG if the other image already loaded (to save on bandwidth). Also, when the external CSS based cache busting doesn't work, you basically just fall back to the unpredictable ETag/Expires based PHP cache busting approach as described by alv-c.

Note: it seems that in the HTML/CSS, imageA and imageB are basically the opposite of my naming scheme in the PHP file; I haven't had time to clean things up.

Hi 5os! Thanks for your contribution. Up to now Alv's is the most complete solution and the one that most completely addresses the question. You haven't submitted a complete solution yet. However, there are some interesting concepts in your answer, so I want to know more. For example, switching to Cache-Control: private seems like it might be a good idea. However, I believe the rest doesn't apply to this use case. We are generating images on the fly; thousands of images might be generated at any time with different parameters that also change according to user settings. There is nothing in the HTML to be adjusted, and the HTML cannot know anything about the images, since when the images are served in emails, for example, there will be no PHP.
georgefountain 13 days ago
So all the logic must be done when serving the image, on the PHP code. If you can do anything to improve the current solution I will award you a tip. If you find a revolutionizing complete solution as you mention, I will award you the bounty.
georgefountain 13 days ago
Uncached CSS from an external URL can be loaded in a static HTML email and be used to provide a dynamic pointer. I will make a demo and update my answer, just give me a minute.
5osxcwbf 13 days ago
Please have in mind: HTML emails do not use external CSS. They use only inline CSS, tables and html4 (to achieve global email client support)
georgefountain 13 days ago
External CSS isn't supported in some clients (the Gmail web client etc.) but it is supported in several widely used desktop clients (80% of people using Gmail don't use web clients). But the same is true for the ETag/Expires route; its support is also unpredictable. The reason high traffic sites use the base.css?v1 type of approach is that it gives you a lot more control than the unpredictable ETag/Expires approach. So yeah, the best approach is probably a combined one with (parts of) alv-c's answer as a fallback; that way you get the best of both worlds. I will update my post with demo code for a hybrid solution.
5osxcwbf 12 days ago
Check my edit; I could clean things up (When i have time) but it would be nice to first see an accepted answer / tip.
5osxcwbf 12 days ago
Hi 5o. Thanks for the reply, but I honestly don't understand what you are trying to prove or do. As I explained before, we are talking about thousands of images created dynamically which change according to user data. Why then append a manual "version" parameter? How is the system supposed to handle that? This is not the case of a single CSS file being updated manually. I don't understand either what you are trying to prove with your code above... CSS support in email?
georgefountain 12 days ago
I gave a rudimentary demo; think bigger. Every email recipient loads their own CSS; obviously the CSS isn't updated manually. You could for example md5 hash the email address so that every client loads [hash.css] (and use PHP to dynamically generate the CSS while also keeping nice logs about email opening times etc.). The versioning stuff would also update automatically once a change in the image is detected. I assumed these things were obvious but perhaps I should have been more clear :]
5osxcwbf 12 days ago
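The per-recipient stylesheet name mentioned in the comment above could be derived roughly as follows. This is only a guess at what was meant: the function name recipientStylesheetName and the trimming and lowercasing of the address before hashing are assumptions, not something the comment specifies.

```php
<?php
// Hypothetical helper (name invented for illustration): derive a stable,
// per-recipient stylesheet file name from an email address.
function recipientStylesheetName(string $email): string
{
    // Normalising first is an assumption of this sketch, so that
    // "User@Example.com " and "user@example.com" map to the same file.
    $normalised = strtolower(trim($email));

    return md5($normalised) . '.css';
}
```

The resulting name is a 32 character hex digest followed by ".css", which a PHP script could then map back to the recipient when generating the CSS.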
Sorry 5os, but the solution you are providing is outside the scope of the question and it is extremely complicated. I haven't even understood what you mean. Serving a unique CSS file with a different hash for each user? That would mean making a major change to the system just for caching? And what if the image is used outside that CSS, somewhere else? If there is anything you can add to Alv's solution I will be happy to award you a tip, but at this point I believe you solved a different question. My use case is very different from what you are proposing.
georgefountain 11 days ago
I think I have wasted enough time answering this question. You will find out for yourself why all high traffic sites do cache busting using filename versioning and only use ETags/time expiration (if they use it at all) as a fallback, last resort option. Everything you need to know is in my answer. No one likes stale content that results from unrefreshed caches caused by flaky ETag/time expiration cache implementations (all clients handle these things differently). Don't blame me for the people that will no longer use your platform; blame your own (and alv-c's) amateurish understanding of caching and blatant disregard of industry standard best practices. Hope you have a great day.
5osxcwbf 11 days ago