Some things to think about (feature req.)

kvdb

Joined: 2002-10-10
Posts: 29
Posted: Thu, 2002-10-31 20:41

I found myself thinking about a few gallery features during the week. Unfortunately, I have no idea how to implement such difficult plans. It sure won't make gallery management any easier.

Ok, here's an example why you should want this feature:
You've got a hosting provider that allows you to put 100MB online. In G1, you had your original picture, a resized one and a thumbnail. All three were always in the database!

There are a number of reasons for not wanting the original (or maybe only the original and not the resized one) online.
1) You don't have the disk space
2) Nobody would want to look at the high-res scans of your vacation photos, but you still want to keep them in your gallery for yourself (for a reprint, maybe).

Possible solutions can be:
Let G2 use two separate databases (on different computers or hard disks).
Example:
One database is put online for users to see. It is located on the 100MB webhost. The other one resides on the admin's own home computer with lots of storage. Only the admin is able to see the photos in the latter database.

It would be almost too cool to combine this feature with the last one I suggested (topic: Here are some nice features :smile:).
Example:
This example is exactly the situation I would like to have.
I've got 100MB webspace on a fast internet connection (location A). In addition, I've got my own webserver on a slow cable modem (15kB/s upload) (location B).
All pictures with a depth level higher than 50 (uninteresting pictures) should be in the database at location B. All interesting pictures (those with a depth level from 0 to 50 assigned to them) should be at location A.

I guess this is an administrator's nightmare, but it sure would be fun! :grin:

Cheers, Kees

By the way, when can I start translating Gallery into Dutch? Does it already make sense to do it now?

 
bharat

Joined: 2002-05-21
Posts: 7994
Posted: Thu, 2002-10-31 21:38

Managing two sets of data would be challenging, and I don't think that it's worth attempting. However, we can simulate your desired behaviour with something like this:

1. You install Gallery at home and put all your original images on it, organized in any way that you choose.

2. We implement a RemoteDataItem class on the server side (which can be done via an add-on module) that allows you to specify the source image as a URL, instead of uploading the file. Gallery stores this URL only on the limited capacity server, and then generates all the derivative images (resizes, thumbnails, highlights, etc) from the URL version.

If you move the location of the source images, you'll break all your derivative images (though we could write some code to help you fix that).
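
A minimal sketch of the RemoteDataItem idea, written in Python for brevity rather than Gallery's PHP. The class layout, the injected `fetch` downloader and the example URL are assumptions for illustration, not G2's actual API. It shows the key property: only the URL and the derivatives live on the small server, and the original is pulled on demand.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class RemoteDataItem:
    """Hypothetical item that stores only the URL of its source image."""
    source_url: str
    fetch: Callable[[str], bytes]  # injected downloader (assumption)
    derivatives: Dict[str, bytes] = field(default_factory=dict)

    def build_derivative(self, name: str,
                         transform: Callable[[bytes], bytes]) -> bytes:
        # The original is downloaded, transformed and then discarded;
        # only the derivative is kept on the limited-capacity server.
        original = self.fetch(self.source_url)
        self.derivatives[name] = transform(original)
        return self.derivatives[name]
```

If the source URL moves, `fetch` fails and no derivative can be rebuilt, which is the breakage mentioned above.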

None of this will happen in G2.0, but maybe in G2.1 (or sooner if somebody other than me is going to write the module).

 
valiant

Joined: 2003-01-04
Posts: 32509
Posted: Tue, 2003-12-23 06:57

@Mirroring:
IMO you should have several types of mirror configuration, ranging from fully automated (similar to the one in Gallery 1.4, with just global info on what the /album mirror URLs are) to atomic/fully manual, per-album settings (more work for the admin, but he gets more flexibility).
Even the "derivatives" use a lot of bandwidth if requested millions of times; that's the argument.
i don't know the code of gallery 1 syncing, but i guess g1 compares for each requested page the local serial.xy.dat[/album.dat/photos.dat] with the mirror. and the serial.xy.dat gets incremented even for click counts.
the power admin user should be able to decide when a mirror is outdated. if he wants no serial.xy.dat check before each page request, if comments, click counts, caption texts,.. outdate a mirror and so on.
all meta data is always saved in the local DB and all high bandwidth related objects should have the ability to be elsewhere/mirrored.

Perhaps that violates the concept of G2's rock-solid stability / data integrity. Perhaps you could add a state to an album, "noImageManipulations". In that state people can view the album and write comments, but nobody is allowed to manipulate, upload, move or delete pictures in that specific album.
And only during manipulation of the album will G2 sync with the mirror, check what images are there, what's not, and what the state is. If you want to add pictures to the mirror, you have to generate everything on the local machine and then G2 could let you download/export them to a mirror.

Don't know whether this could be done in a module or affects the core too strongly.

Thanks for your time - Andy

 
bharat

Joined: 2002-05-21
Posts: 7994
Posted: Wed, 2003-12-24 22:50
valiant wrote:
i don't know the code of gallery 1 syncing, but i guess g1 compares for each requested page the local serial.xy.dat[/album.dat/photos.dat] with the mirror. and the serial.xy.dat gets incremented even for click counts.

There's a bug in 1.4.1 where we update the serial file on every click count, which makes the mirroring code pretty worthless. This is fixed in 1.4.2.

valiant wrote:
the power admin user should be able to decide when a mirror is outdated. if he wants no serial.xy.dat check before each page request, if comments, click counts, caption texts,.. outdate a mirror and so on.
all meta data is always saved in the local DB and all high bandwidth related objects should have the ability to be elsewhere/mirrored.

I agree that we need to get more control over the mirror. Right now we check before every request to see if the mirror is up to date, but that can be very inefficient if the mirror is actually down or not responding (which is a common complaint).

Mirroring in G2 is going to be a more difficult problem than in G1. The issues we have with G1 about determining the state of the mirror can mostly be fixed pretty easily. For example, we can avoid checking the mirror every time; check it once in a while to see if it's up to date and if it is, then we use the mirror until we make a change that would affect the mirror again. We can also mirror and checksum at the original/derivative level instead of the album level, which will give us greater control. I.e., modifying one image in an album doesn't mean the whole album has to come from the source -- just that one image.
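
To make the per-image checksum idea concrete, here is a rough Python sketch. The function names and the choice of SHA-256 are assumptions; the thread does not specify a checksum algorithm. It compares checksums item by item, so only the changed image falls back to the source.

```python
import hashlib
from typing import Dict, Set


def checksum(data: bytes) -> str:
    """Checksum of one original or derivative (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def stale_items(source: Dict[str, bytes],
                mirror_sums: Dict[str, str]) -> Set[str]:
    """Return ids whose mirror copy is missing or differs from the source."""
    return {item_id for item_id, data in source.items()
            if mirror_sums.get(item_id) != checksum(data)}
```

Everything not in the returned set can keep coming from the mirror.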

The problem with mirroring in G2 is that we cannot allow direct access to the images. We have to maintain the same permissions on the mirror as we have at the source. The source has the benefit of the full database, however, whereas the mirror doesn't have any of that. I still have not yet come up with a good plan to allow the browser to request images from the mirror and have the mirror deliver them without the mirror having to have an entire copy of the database. I've considered plans including doing things like embedding a signature of some kind in the url so that the mirror can verify that the browser is allowed to see the image. I'd rather that the source and the mirror did not have to contact each other during this process as that leaves us open to the problem where the mirror is down, etc, but I don't know of a way around that.

Thoughts?

 
valiant

Joined: 2003-01-04
Posts: 32509
Posted: Thu, 2003-12-25 00:10

1. Creating the mirror:
- In Gallery2 you click "create mirror signature file"
This generates a textfile for example album.dat.
In this textfile there is for each image of this album the filename/id and a timestamp of last modification on this image or its derivatives.
- You put all images/derivatives of this album + the signature file on the mirror.
-> The admin is responsible that no-one alters the files on the mirror.

2. Syncing:
The only time you have to check whether the mirror is in sync for a specific album is when you modify the source images/derivatives.
If the source has been modified and there is a mirror for this album, Gallery2 will check the signature file on the mirror (public readable, no problem) and display what images have been modified, deleted and added.
It will also generate a new signature file and the admin is responsible to add/delete/replace modified files + replace the signature file on the mirror.
Perhaps, you could make it a 2 step work flow. After the modification of the source you get the diff to the mirror and the new signature file.
After you have finished the manual syncing (uploading,...) you tell the gallery that the mirror is now in sync again.

3. Checking whether the mirror is alive
We know that the mirror is in sync. It's always in sync except in the minutes when someone modifies the source. So we just have to know whether the mirror is alive.
A simple ping would do the job. That takes maybe 10 - 100 ms, pretty long.
Perhaps you know that the mirror is reliable and don't need to ping it every time.
You could add a link on the page "if the pictures don't appear, the mirror may be down. please inform us by clicking this link".
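
The signature-file workflow from steps 1 and 2 could be sketched roughly as follows (Python for brevity). The file format, one "filename timestamp" pair per line, and all function names are assumptions made for this example.

```python
from typing import Dict, NamedTuple, Set


class ManifestDiff(NamedTuple):
    added: Set[str]
    deleted: Set[str]
    modified: Set[str]


def parse_manifest(text: str) -> Dict[str, int]:
    """Parse lines of '<filename> <unix timestamp>' into a dict."""
    entries: Dict[str, int] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last space so filenames may contain spaces.
        name, stamp = line.rsplit(" ", 1)
        entries[name] = int(stamp)
    return entries


def diff_manifests(source: Dict[str, int],
                   mirror: Dict[str, int]) -> ManifestDiff:
    """Report what the admin must upload, remove or replace on the mirror."""
    added = set(source) - set(mirror)
    deleted = set(mirror) - set(source)
    modified = {n for n in set(source) & set(mirror)
                if source[n] != mirror[n]}
    return ManifestDiff(added, deleted, modified)
```

The diff is what Gallery would display after a modification; the admin then syncs by hand and uploads the new signature file.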

Another thing concerning mirrors and bandwidth balancing:
Perhaps you could define bandwidth maxima based on estimated bandwidth/request calculations per mirror + balance the bandwidth by defining a percentage of load for each mirror.
This is just an idea, I don't really need that, but maybe there will be professional customers who are interested in something like that.
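
One simple way to read "a percentage of load for each mirror" is weighted random selection. A small Python sketch, where `pick_mirror` and the share values are hypothetical:

```python
import random
from typing import Dict, Optional


def pick_mirror(shares: Dict[str, float],
                rng: Optional[random.Random] = None) -> str:
    """Pick a mirror at random, in proportion to its configured share."""
    if rng is None:
        rng = random.Random()
    names = list(shares)
    weights = [shares[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

With shares of 60 and 40, roughly 60% of requests go to the first mirror over many requests. Any single page view may still deviate, so this balances load only on average.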

Cheers - Andy

 
bharat

Joined: 2002-05-21
Posts: 7994
Posted: Thu, 2003-12-25 00:27
valiant wrote:
1. Creating the mirror:
- In Gallery2 you click "create mirror signature file"
This generates a textfile for example album.dat.
In this textfile there is for each image of this album the filename/id and a timestamp of last modification on this image or its derivatives.
- You put all images/derivatives of this album + the signature file on the mirror.
-> The admin is responsible that no-one alters the files on the mirror.

There are a variety of interesting issues with this. For starters, the last modification date of the derivative is significant, but it's not the ultimate arbiter of whether or not the derivative is up to date. A derivative can be attached to any source image in the system and that source can change independently of the derivative, so in order for us to know if the derivative is still valid we need to have a more complete dependency map. This will be doable, I think but it won't be trivial.

valiant wrote:
2. Syncing:
The only time you have to check whether the mirror is in sync for a specific album is when you modify the source images/derivatives.
If the source has been modified and there is a mirror for this album, Gallery2 will check the signature file on the mirror (public readable, no problem) and display what images have been modified, deleted and added.
It will also generate a new signature file and the admin is responsible to add/delete/replace modified files + replace the signature file on the mirror.
Perhaps, you could make it a 2 step work flow. After the modification of the source you get the diff to the mirror and the new signature file.
After you have finished the manual syncing (uploading,...) you tell the gallery that the mirror is now in sync again.

It doesn't make a lot of sense to mirror at the album level (see my above points) so the signature files will have to have a finer-grained context. This is still doable. But the bigger problem is that the images and the index cannot be stored in a publicly accessible way without leaking information that we don't want leaked. G2 does not reveal to the world how many derivatives exist or what their dependency maps are, so the G2 mirror cannot reveal that either without reducing our security. This means that we have to have PHP running on the remote mirror (which raises the bar for mirrors in general) and the processing must happen on the mirror.

valiant wrote:
3. Checking whether the mirror is alive
We know that the mirror is in sync. It's always in sync except in the minutes when someone modifies the source. So we just have to know whether the mirror is alive.
A simple ping would do the job. That takes maybe 10 - 100 ms, pretty long.
Perhaps you know that the mirror is reliable and don't need to ping it every time.
You could add a link on the page "if the pictures don't appear, the mirror may be down. please inform us by clicking this link".

Sure, we can work out something like this without too much difficulty. If the mirror is running PHP then we could also have it ping the main server and let it know its current status in an asynchronous fashion.

None of this addresses the big problem here, which is figuring out how to propagate security to the mirror. Until we solve that problem, I don't think we can realistically move forward. When the browser requests an image from the mirror we must be able to tell whether or not the browser's session has the permission in G2 to view the image...

 
valiant

Joined: 2003-01-04
Posts: 32509
Posted: Thu, 2003-12-25 00:57

ok, didn't get it that this is all about security and not only integrity.

I agree on that. If you want the same security on the mirror you need to authenticate the user / his permissions before serving the images. That requires a scripting language of our choice - php.
If you want only this user and no other to get the data, you could do something like SecurID: source and mirror share some kind of password list, based on the current time and/or an unknown algorithm. The source creates an id for each image request and appends the id to the image URL in the HTML. The browser requests the images from the mirror with the appended ids; the mirror's PHP script checks the id and sends back the image. That way you don't need a database on the mirror, and the two servers don't need to talk to each other.

But it's pretty inefficient to validate each image request. A typical thumbnail page has 4*3=12 images. So the user waits the time the source g2 needs to process his request and create the html + 12* process time of the mirror - insane.
You could lower the bar and have the admin decide if he wants authentication on the mirror. Or do authentication only on albums where needed (not public viewable).
If the mirror needs php, perhaps he's got a DB too, or it could connect remotely to the source G2 db. So the mirror could run a complete G2.
Why not host G2 on the mirror then?
Usually the mirror has no bandwidth limitation, but it lacks PHP, or runs with Safe Mode on, or has some other restriction, and that's why we can't host G2 there.
So the requirement of php for the mirror is IMO not an option.

 
valiant

Joined: 2003-01-04
Posts: 32509
Posted: Thu, 2003-12-25 01:11

@mirror data integrity:
mirroring is done on album level!
the mirror is always up to date, as it only becomes obsolete when someone modifies the source images (adds, deletes, or alters the image itself) or changes the thumbnails.
so we do the comparison also for thumbnails as you said.
how do we compare? take the timestamp of the files from the source.

I don't get the problem, what dependency map? What do you mean by

Quote:
A derivative can be attached to any source image.

A derivative in G2 is a thumbnail of an image, right?
Does it make sense that a derivative can be attached to any source image?? It's a 1 (source image) to many (thumbnails in different sizes) relationship and not the other way around.

 
valiant

Joined: 2003-01-04
Posts: 32509
Posted: Tue, 2003-12-30 17:42

@Security part 2:
Other ideas:

- Authenticate on apache level (use PAM, LDAP?): each image request in the HTML has the form of http://username:password@mirror_url.com/albums/image.jpg
-> no scripting language / db on mirror, but extensive use of apache authentication.

- Authentication per session, only step 3.a) takes a lot of time and resources:
1. Mirror receives image request from browser.
2. Mirror checks whether the session key is valid.
3.a) If session unknown: Mirror validates session key with source G2 and stores the valid session key locally in a text file or db.
4. If valid: Mirror sends image to browser.

- Offer (at least) two solutions in G2: A: For people who can afford / really need secured access to the images, B: for those who are happy with the Gallery 1 mirror solution.
@A:
- These people probably have access to the DNS entries and can resolve the address of the gallery to multiple IPs -> load balancing
- All above ideas apply for this group.
@B:
- They need only data integrity, no php/db on mirror.

 
bharat

Joined: 2002-05-21
Posts: 7994
Posted: Fri, 2004-01-02 22:26
valiant wrote:
I don't get the problem, what dependency map? What do you mean by
Quote:
A derivative can be attached to any source image.

A derivative in G2 is a thumbnail of an image, right?
Does it make sense that a derivative can be attached to any source image?? It's a 1 (source image) to many (thumbnails in different sizes) relationship and not the other way around.

It's useful for us to attach a derivative to any source image. Consider the scenario where you have Album A containing Album B, which has a thumbnail from image B1 inside Album B. If Album B's thumbnail size is 150px, but Album A's thumbnail size is 200px, we'd want Album A's derivative to be sourced directly from image B1, which isn't directly inside Album A. Or how about the scenario where you want to set the thumbnail for movie B2 from image B3. And there are probably other scenarios that we haven't thought of yet. At the moment, any item's thumbnail derivative can be sourced to any other item's source image (though we don't allow this in the UI yet).
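
A rough illustration of why a dependency map is needed, in Python rather than G2's PHP. The `Derivative` record and its field names are invented for this sketch. Because a derivative points at an arbitrary source, its own timestamp is not enough; it has to be compared against the source it was built from.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Derivative:
    derivative_id: str
    source_id: str   # any item in the system, not necessarily the parent
    built_at: int    # when this derivative was last generated


def stale_derivatives(derivatives: List[Derivative],
                      source_modified: Dict[str, int]) -> List[str]:
    """Return derivatives whose source changed after they were built."""
    return [d.derivative_id for d in derivatives
            if source_modified[d.source_id] > d.built_at]
```

In the Album A / Album B scenario, both album thumbnails list B1 as their source, so a change to B1 invalidates both even though neither album was touched.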

 
bharat

Joined: 2002-05-21
Posts: 7994
Posted: Fri, 2004-01-02 22:42

There are a lot of good ideas here. I've been striving to come up with one that lets us do minimal mirror validation (ie, something very lightweight), preserves security and only serves up valid images.

I agree that we can offer no- (or low-) security mirroring as an option to the admin and we can always fall back upon that. We can even say that mirroring is only available for images with security, etc etc. I'm not too worried about implementing that.

Security
Authentication per session doesn't solve the problem because you still have to know which images that session is allowed to see. You can't allow the user to see whatever images they want because they have a valid session (otherwise, I can look at the url used to get one image from the mirror and then tweak the url to get all the other images). We'd have to mirror all the item permissions (at least the view permission) for that session in order for it to work. There may be ways to make this work (like by pushing the permissions over to the mirror) but I can't think of any that aren't tedious.

What I really want is an approach that allows the mirror to decide whether the request is valid simply by looking at the url. I've been pondering a strategy like this:

The server has a private key and (if it knows the mirror is valid) on every image view it crafts a url to the mirrored image that has the following information in it:

  • The user's session id
  • The source or derivative id
  • The request timestamp
  • The client's IP address

This information is digitally signed by the server using a shared secret that is on both the server and mirror. The server will only create and sign this url if the client session has view permission.

The mirror is running a very simple PHP script that validates the signature to know that it was legally received from the server. Then it verifies the client's IP address to make sure that the url was not hijacked. If the request timestamp is current (say, within an hour) and the mirror has a copy of the source/derivative then it returns the image. Otherwise, the mirror makes a request to the server and gets the latest copy (perhaps it just sends a HEAD request and only gets the image if it has changed) of the image, stores it locally and then serves it up.

This solution requires PHP support on the mirror, but it does a lot for us. For starters, it means that the mirror is now a read-through cache of the data and will be kept up to date. We won't have to make inter-server validation requests for each album or image. We have full security (since all security is managed on the server side). If permissions change, the mirror will stop serving up the image after the mirror-URL's timestamp expires. This is also very portable since it will run anywhere that PHP does.
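
The signed-URL check described above could look roughly like this. It is a sketch in Python rather than the PHP script the mirror would actually run, and it makes several assumptions: an HMAC-SHA256 over the query parameters stands in for "digitally signed with a shared secret", and the parameter names, the secret value and the one-hour limit are illustrative. The read-through fetch from the server is left out.

```python
import hashlib
import hmac
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

SHARED_SECRET = b"example-secret"  # assumption: known to server and mirror
MAX_AGE = 3600                     # seconds a signed URL stays valid


def _signature(params: dict) -> str:
    # Sign the parameters in a fixed (sorted) order.
    payload = urlencode(sorted(params.items())).encode()
    return hmac.new(SHARED_SECRET, payload, hashlib.sha256).hexdigest()


def sign_url(base: str, session_id: str, item_id: str,
             client_ip: str, now: int) -> str:
    """Server side: called only after the view permission check passed."""
    params = {"sid": session_id, "item": item_id,
              "ts": str(now), "ip": client_ip}
    params["sig"] = _signature(params)
    return base + "?" + urlencode(params)


def verify_url(url: str, client_ip: str, now: int) -> Optional[str]:
    """Mirror side: return the item id to serve, or None to refuse."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    params = {key: values[0] for key, values in query.items()}
    sig = params.pop("sig", "")
    if not hmac.compare_digest(sig.encode(), _signature(params).encode()):
        return None                      # not issued by the server
    if params["ip"] != client_ip:
        return None                      # URL was passed to another client
    if not 0 <= now - int(params["ts"]) <= MAX_AGE:
        return None                      # too old
    return params["item"]
```

Tweaking the item id in the URL invalidates the signature, which closes the hole that plain per-session authentication leaves open.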

What do you think about that?

 
valiant

Joined: 2003-01-04
Posts: 32509
Posted: Sun, 2004-01-04 18:30

integrity:

db:
table mirror: mirror_id, address(url/ip), mirror_type, communication(G2 to mirror protocol, login)
table mirror_entity_map: entity_id, mirror_id, last_synced(timestamp)

mirror types/ideas: (summing-up our thoughts)

v.1. integrity only
_desc:
G1-like mirror, but integrity is only checked when manipulating album items. -> fast
_procedure:
G2 checks for each entity in served html if there's a valid mirror (last_synced > last_relevant_manipulation)
if so, it creates image urls pointing to the mirror, that's it.
_features:
on album level, no security, full data integrity as long as the mirror is online
and no one changed the files on the mirror. a simple ping of the mirror would enhance reliability.
_mirror_requirements: webserver
_G2_requirements: the above db tables / classes

v.2. remote item
_desc:
variation of v.1. but the item is stored only remotely. if you want to manipulate the item,
G2 first needs to get the item, manipulate it and send the new version of the item+derivatives.
_procedure: same as v.1.
_features:
v.1. + web space and bandwidth can both be managed / balanced on multiple servers
_mirror_requirements: webserver
_G2_requirements: v.1. + remote item class etc.

security (incl. integrity):

v.3. v.1. + security
_desc:
integrity as in v.1. + your last security version (which is perfect, I like it :) )
_procedure:
as in your description
_features:
full integrity + security
_mirror_requirements: webserver + scripting language (php/perl)
_G2_requirements: v.1. ++

v.4. remote item + security analogous to v.2. and v.3.
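
The v.1 check (last_synced > last_relevant_manipulation) could be sketched against the two tables listed above. This uses Python's built-in sqlite3 purely for illustration; the column types, the mirror URL layout and the `image_url` helper are assumptions.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create the two mirror tables in an in-memory database."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE mirror (
            mirror_id INTEGER PRIMARY KEY,
            address TEXT, mirror_type TEXT, communication TEXT);
        CREATE TABLE mirror_entity_map (
            entity_id INTEGER, mirror_id INTEGER, last_synced INTEGER);
    """)
    return db


def image_url(db: sqlite3.Connection, entity_id: int,
              last_manipulated: int, local_url: str) -> str:
    """Point at a mirror only if it was synced after the last change."""
    row = db.execute(
        "SELECT m.address FROM mirror m "
        "JOIN mirror_entity_map e ON e.mirror_id = m.mirror_id "
        "WHERE e.entity_id = ? AND e.last_synced > ? LIMIT 1",
        (entity_id, last_manipulated)).fetchone()
    # Hypothetical URL layout: <mirror address>/<entity id>.jpg
    return f"{row[0]}/{entity_id}.jpg" if row else local_url
```

No request to the mirror is made while rendering the page; the decision is a single local lookup.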

edit by valiant, 2005/09/27: maybe we could manipulate on the remote host too. But then I slowly begin to wonder why not use the remote host as the front-end anyway.

 
bharat

Joined: 2002-05-21
Posts: 7994
Posted: Mon, 2004-01-05 06:59

Thanks for the summary. I'm probably not going to work on it right away, but when we get going I'd like to do v.3 and then v.4 in that order. I think that a lot of people would really like the capability of managing all of their images entirely on a remote server. Administration would be slow, but this would be a very powerful feature!

 
adventure

Joined: 2004-03-21
Posts: 59
Posted: Mon, 2004-04-26 21:29

Great stuff! Too bad I didn't read it a month ago.

I made a quite nice (if I may say so) update of the mirror system for Gallery v1.

It's based on a couple of things, in short:

Check the mirror with one file that lists which albums are online, the time of sync, whether it serves thumbs, etc.

On every hit, Gallery runs some code that checks whether the mirrors need to be checked (you define an interval, currently 2 min).

By using fsockopen you can define a timeout for your connection (currently 2 sec).

When showing an album, use round robin to decide which (valid) mirror to choose.
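
For readers who have not seen the patch, the combination of a check interval, a connection timeout and round robin can be sketched like this. It is Python, not the actual Gallery 1 code, and the class and function names are made up; `socket.create_connection` with a timeout plays the role of PHP's fsockopen here.

```python
import socket
import time
from typing import Callable, List, Optional


def tcp_probe(host: str, port: int = 80, timeout: float = 2.0) -> bool:
    """Try to open a TCP connection, giving up after `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class MirrorPool:
    def __init__(self, hosts: List[str],
                 probe: Callable[[str], bool] = tcp_probe,
                 interval: float = 120.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.hosts = list(hosts)
        self.probe = probe
        self.interval = interval      # seconds between mirror checks
        self.clock = clock
        self.valid: List[str] = []
        self.last_check: Optional[float] = None
        self._next = 0

    def _refresh(self) -> None:
        # Re-probe the mirrors only when the interval has elapsed.
        now = self.clock()
        if self.last_check is None or now - self.last_check >= self.interval:
            self.valid = [h for h in self.hosts if self.probe(h)]
            self.last_check = now

    def pick(self) -> Optional[str]:
        """Round robin over the mirrors that passed the last check."""
        self._refresh()
        if not self.valid:
            return None               # caller serves from the primary site
        host = self.valid[self._next % len(self.valid)]
        self._next += 1
        return host
```

Between checks, a page hit costs no network round trip at all; a dead mirror delays at most one hit per interval, by at most the timeout per mirror.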

Read more at: http://gallery.menalto.com/index.php?name=PNphpBB2&file=viewtopic&t=15403

When I have some time I will also include bandwidth support, something like 60% of requests to server A and 40% to server B.
(probably a small hack)

You can define a lot of options, like always checking,
or letting the primary site also host pictures, etc.

Hope you like it.

Cheers,
Harry

 
valiant

Joined: 2003-01-04
Posts: 32509
Posted: Mon, 2004-04-26 21:54

adventure, nice work :)
I'm sure bharat will be glad to see how you did it when programming the remoteItem for G2, it's always good to see how other people resolved the same problem.
Perhaps you could talk to hobbel and integrate your changes to the next release of G1.

 
bharat

Joined: 2002-05-21
Posts: 7994
Posted: Wed, 2004-04-28 07:12

This looks excellent, Harry. It'll be a little while before we get around to doing this for G2, but when I do I'll be sure to review what you've done.

 
adventure

Joined: 2004-03-21
Posts: 59
Posted: Wed, 2004-04-28 19:02

Thanks for the compliments. :D

I don't know if it helps you much in building a solution for Gallery 2. The framework is very different (I think so after reading this thread).

A tip: when you check a mirror and it appears to be down, don't check every time whether it's back up. Once it has been down 5 times, it will probably still be down for the next hour. After one hour, reset the failure count (the 5 tries).

This will make a "permanently broken mirror" less of a drag on the entire site.
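
A small sketch of that back-off rule in Python. The `MirrorHealth` class, the injected `probe` callable and the exact numbers (5 failures, one hour) are illustrative assumptions, not the patch itself.

```python
import time
from typing import Callable


class MirrorHealth:
    def __init__(self, probe: Callable[[], bool],
                 max_failures: int = 5, cooldown: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.probe = probe
        self.max_failures = max_failures
        self.cooldown = cooldown      # seconds to leave a dead mirror alone
        self.clock = clock
        self.failures = 0
        self.gave_up_at = 0.0

    def is_up(self) -> bool:
        now = self.clock()
        if self.failures >= self.max_failures:
            if now - self.gave_up_at < self.cooldown:
                return False          # still cooling down: skip the probe
            self.failures = 0         # cooldown over: start counting again
        if self.probe():
            self.failures = 0
            return True
        self.failures += 1
        if self.failures >= self.max_failures:
            self.gave_up_at = now
        return False
```

While the mirror is written off, visitors no longer pay the connection timeout on each check.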

I will talk with h0bbel about integrating it with Gallery 1.
Should I use the forum or e-mail to get to him?

 
valiant

Joined: 2003-01-04
Posts: 32509
Posted: Wed, 2004-04-28 22:43
adventure wrote:
I will talk with h0bbel about integrating it with Gallery 1.
Should I use the forum or e-mail to get to him?

don't bother, he's open for any input :)

 
h0bbel

Joined: 2002-07-28
Posts: 13451
Posted: Wed, 2004-04-28 23:25

valiant, hey!! :)

Well, it's not up to me. I wonder where you guys got that idea from! :)
I'm sure that Bharat has something he wants to say about it. Also, adventure, join us at #gallery on freenode.net (IRC) and you can discuss it with the developers directly. Btw, that invite is for you too, valiant. :)