Scheduler installed as per this page: http://gallery.menalto.com/node/99806
Do I need to set up the crontab job as described? What am I setting up when I look at Maintenance in G3 and press the Schedule button next to each task (S3 sync in this case)? Am I setting a time there? Do I need to do it manually in crontab too?
Also, I can now run the S3 sync manually and it works!
ok, been hard at work and got another beta version to try out.
Download here. (aws_s3 v2 beta 4)
- Re-jigged the tasks so that they work correctly both when run manually and when scheduled.
- Made notices more informative as to what to do next.
I've also updated the scheduler module and started beta numbering that too.
Download here. (scheduler v2 beta 1)
- Extended the task definition so a task can be marked as schedule-only or as run-there-and-then only.
- Added extra "scheduled" state for task screen.
(i'm going to post this on that module's own forum topic, and eventually on a codex page with docs for other developers to use.) If you're updating, run this SQL in mysql/phpmyadmin/etc.:
DROP TABLE IF EXISTS `aws_s3_meta`;
DELETE FROM `modules` WHERE `name` = 'aws_s3';
DELETE FROM vars WHERE module_name = 'aws_s3';
# use these if you use the scheduler module
DROP TABLE IF EXISTS `schedules`;
DELETE FROM `modules` WHERE `name` = 'scheduler';
DELETE FROM vars WHERE module_name = 'scheduler';
then visit Admin -> Modules and re-enable the modules, re-configure aws_s3 and re-sync. hopefully this will solve a lot of issues now. apologies if you have a big gallery that you're running this on; re-syncing from scratch is the only way to ensure the data's integrity.
danneh.org :: Gallery3
That's great, thank you Dan! Sync successful.
But is it right that in maintenance now, there is no 'schedule' button next to Sync with Amazon S3? I only have a run button...
correct. the "run" button on this runs a task which does some pre-processing and then schedules the task for you. a little confusing, probably. but that i found was the best, easiest way to implement it.
i'm glad everything's going ok right now though! good news
so far no problems here either ..
one-off sync after upgrade done with no error..
server add works.
files magically synced to s3 afterwards too..
looks perfect so far! bugs to be reported asap ;)
regular git puller.
awesome beta 4 yea? *does a dance* lol
Thanks Dan for a great module for g3.
I seem to be having problems syncing my g3 installation with s3. It's probably a configuration issue on my end, but I was hoping you could help. My running tasks list always shows the synchronization as stalled, but the number of photos uploaded keeps increasing, usually stopping at around 500 of 1700. At this point, all available RAM is in use. For testing purposes, I have only selected to upload thumbnails, but when I check the aws_s3_meta table I see "1" in the "fullsize_uploaded" column. Sorry for the brain dump; I just wanted to give you all the information I have found.
First, thanks very much, Dan, for this great module!
I installed and configured it, but the images are not showing on my website!
It seems that the image links shown on my website are different from what's inside my S3 bucket:
The image link that shows on my website:
The real link within my bucket:
Inside my bucket, there are only folder structures but no other files.
Do I need to change anything within the URL String setting?
Thank you for helping me.
my guess is the non-alphanumeric characters in the filename are causing issues here, either causing the access denied messages when trying to download the file, or causing issues when attempting to upload them in the first place. try editing the photo in question and changing the filename to something without brackets (just upper and lower case letters, numbers, spaces, _ or -), and see if the file gets re-uploaded by the module.
My website is at : pupushop.com/gallery3
Please have a look at it if you had time.
I don't think the problem is about the characters; for example, this one:
hm. how very odd. are you using the latest v2beta4 version, or the original v1?
danneh.org :: Gallery3
Well, I've just realized that my version is 1, so I disabled it and replaced it with the newest version. But after activating it, the gallery was messed up and the homepage didn't display anything.
So I decided to reinstall gallery3.
Then I activated the Amazon S3 module; the system said that it needs the Scheduler module, so I installed and activated that, but got "Dang... Something went wrong! ...", so I deactivated Scheduler.
Then I uploaded new pictures, went to Maintenance and ran the Amazon S3 task. Now all the pictures are on AWS S3!
I'll try to find out more details on the problems but now AmazonS3 module works for me. Thank you, Dan!
I'm glad everything's going ok for you. If I don't get any more feedback in the next week or so relating to bugs in v2beta4, I'll stabilize it and push it into github.
Then I can start work on v3
Hope this is an Ok place to ask this... I currently have an issue whereby I want to migrate my G2 install into G3, using the import module. The import module 'copies' all my g2data files into the g3/var/ dir, duplicating everything, hence doubling the disk space (until I get a chance to delete the old g2data).
Do you think the version of S3sync where S3 can be the primary storage will solve this? I.e. somehow when I import G2 into G3, the data goes straight to S3?
in theory, if you're running v3 of this module in s3-only storage mode, then yes this would be possible. i think the flow of logic would go something like this:
- g2 import reads data from g2data and commences task
- g2 import imports each item into g3 one at a time
- g3 reads meta data and creates item in g3's database
- aws_s3 module hooks into when the item is created
- aws_s3 intercepts storage of the file, and uploads the various versions of the item that g3 made and inserted into its folder structure
- aws_s3 then deletes what g3 just stored
the last 3 items in the list would be standard behaviour in this mode, so whether an item was created with g2_import, server_add, browser upload, or any other method, aws_s3 would hook into the item-created workflow, upload the files created and delete them from local storage.
so to answer your question: yes, it probably will. knowing that this is also a desired usage of this mode helps expand my test cases for when i start writing it.
That sounds excellent! Might just wait till V3 in anticipation then!
Thanks for continuing development on this much loved module.
Blog & G2 || floridave - Gallery Team
Any news on v3? I know the last post was just a few days ago, but I'm in a bit of a situation. I actually don't have enough space on my current hosting provider to even upgrade from g2 to g3, because the g2_importer does a copy instead of a move of the images. I want to use the aws_s3 module to make that copy just go up into Amazon, but right now I'm running v1 and it still uses local disk space.
I'd be **happy** to be a guinea pig and test a full import of 20+gb of photos into S3 with the v3 code!
hehe, patience, my friend. v2 hasn't been officially released out of beta yet, and work on v3 hasn't even started (though it has been planned a little more). i intend to release v2 this coming week as stable and push it into github as an official release, and commence work on v3 shortly after that. i'll probably have a crude version within a couple of weeks. i don't have a lot of free time on my hands to do stuff like this quickly, and it doesn't help that i've got another project on the go right now as well! lol
however, since you're subscribed to the thread now (you posted), you'll be one of the first to know of updates for this module, when v3'll be ready for testing and for consumption by said guinea pigs ;)
Indeed, I'll be paying close attention. This plugin is long-overdue, and I appreciate you working on it!
Frankly I am surprised every time I see a web-based software that *doesn't* include S3 support for backend file storage..
well, for those watching closely but not wanting to use anything in beta, i've released a stable version of this module. version 2 is now available to download here. the pull request has been made to merge it into gallery3-contrib on github, so it should be there, on gallerymodules.com, and in module-updates shortly
comments, feedback, bug reports on a postcard
EDIT: I see the error of my ways!
Trying to get the code with git... not entirely sure how to just pull this module...
From here: https://github.com/gallery/gallery3-contrib/tree/master/3.0/modules/aws_s3
I tried running
git clone https://github.com/gallery/gallery3-contrib/3.0/modules/aws_s3
But get this:
Initialized empty Git repository in /var/www/gallery/modules/aws_s3/.git/
Cannot get remote repository information.
Perhaps git-update-server-info needs to be run there?
I am guessing that it is quite likely I am just using the wrong command... could you tell me what I should be running?
Have pulled it in thanks...
Will this keep up to date with a pull when v3 comes out?
Have pulled it in thanks...
how? - did it work now the way you described above, or did you do something else? (been asking myself how to pull modules for a while now..)
yep, it'll be replaced with the v3 code when it's released
The way I have done things is to clone the whole of gallery3-contrib; that way I can experiment with modules and themes, but keep things separate from my gallery install.
So I just cloned gallery3-contrib outside my gallery install, then symlinked the modules I want to try/use into my gallery/modules dir.
git clone git://github.com/gallery/gallery3-contrib.git
^will clone the whole of gallery3-contrib to a dir called gallery3-contrib in whatever dir you execute the command in (unless you specify a different dir at the end of that command).
Thereafter, performing 'git pull' inside the gallery3-contrib dir will 'pull' in updates. The pull this morning brought in Dan's aws_s3 module.
I then symlinked it to my gallery/modules to make it available:
ln -s gallery3-contrib/3.0/modules/aws_s3 gallery/modules/aws_s3
small update to the code; a bug fix for it ignoring the flags that set its permission to upload thumbs, resizes and fullsizes.
fixed code here: http://www.danneh.org/files/aws_s3-2.zip
pull request to contrib git made, should be there shortly
Had a play around with this module last night and it's fantastic; it opens up all sorts of possibilities. Note to anyone interested in giving it a go: Amazon S3 is free to set up and you are very unlikely to ever incur any data costs while setting it up (unless you test with uploading Gigs of data). I suggest you give it a go, you will never go back.
I'm very much looking forward to v3 with 100% storage on S3. Thanks for your continuing work on this module.
I have AWS_S3 version 2 up and running and have a problem with uploading large (>100mb) files to S3. I don't know why the upload is failing, as there is no information in any of the logs other than the fact that the upload started. Is there a way to get more detail on how the S3 upload is progressing? My server is behind a painfully small upload pipe.
Since I only have a few of these large (video) files in my Gallery I was hoping to upload the files using the Amazon S3 tool and then manually update the gallery database to indicate they have been uploaded. However, I can’t seem to find the table where AWS stores the information regarding which files have been synced to S3. Where is this information located and how do I indicate the change in location?
Unfortunately, the only progress I can provide to you is how many files are completed. The S3 library doesn't provide the ability to know how far into the upload it is, since it's an HTTP PUT (like POST), and as this isn't multi-threaded, there's no way to "hack" around it to get that data either. So with that in mind, I'm only able to provide a "file started" and "file finished" message.
Now, if you want to upload manually, that's fine. What you need to do is upload the file into the expected folder hierarchy on S3. Once that's done, find the item ID and look it up in the aws_s3_meta table in the database. Set the *_uploaded fields to 1 and save. The S3 module should now believe that the items have been uploaded, even though the task might not be complete. You might want to go and delete the task in Admin -> Maintenance as well while you're at it.
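as a rough example, something like this should do it (1234 is just a placeholder item id; fullsize_uploaded definitely exists, but i'm quoting the other flag column names and the item id column from memory, so check the table structure first):
# example only: mark item 1234 as fully uploaded to S3
# thumb_uploaded / resize_uploaded / item_id are assumed names - verify them against the table
UPDATE aws_s3_meta SET thumb_uploaded = 1, resize_uploaded = 1, fullsize_uploaded = 1 WHERE item_id = 1234;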
At this point, if you've done everything right, S3 module should start redirecting web clients to the S3 copy instead of from local storage.
Hope this helps.
When running the sync task, does it re-upload EVERYTHING or does it just check for changes?
Currently it says it is uploading x of total; whether that means it is processing, but not necessarily uploading, every item, I do not know.
Thank you very much for this module, it's working fantastic for me tied into my linode host.
I have a couple quick questions though:
1.) is there any way to make what's hosted on s3 private? Basically route traffic back through the host (have linode pull the photos from s3 and serve them)? I ask because I don't really want pictures of my kids available to the world.
2.) is there a better way?
3.) basically all I want to do is have s3 host all my images and have the VPS pull from an s3 mount point, without making all my pictures on s3 public.
Do I need to manually mount the s3 bucket and point gallery at it myself for this, or am I missing something?
1) if your album's permissions are set to anything other than "everyone can view", then content is uploaded to s3 with the ACL_PRIVATE flag set. this means that an authentication token/signature must be appended to the end of the url before viewing is allowed. permissions and tokens are granted and generated by the actual access to g3: if g3 is allowed to vend the image to the client, then a token is generated which lasts a specific number of seconds (configurable in admin, defaulting to 60) before it expires, so the url can't be posted and accessed again and again later. there are no privacy issues in that respect, and that was a big thing i was asked to look at in v2, which got implemented very well.
2) is there a better way for what?
3) s3 storage only is not yet implemented, and is scheduled for the next major version (3.0, or 30).
the total number *is* the total number of items, but that's not necessarily the number it will eventually end up uploading. what it does is cross check each item against its table in the db to find out what's uploaded and what isn't, and which versions need syncing. if it's already uploaded, it won't upload it again unless it detects a change (i've actually found an issue with that and am working on a v2.1 release, which will be out shortly).
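conceptually, the per-item check boils down to something like this (an illustration only, with column names partly from memory, not the exact query the module runs):
# illustration: items the sync task would still consider pending
# thumb_uploaded / resize_uploaded are assumed names; fullsize_uploaded is known to exist
SELECT item_id FROM aws_s3_meta WHERE thumb_uploaded = 0 OR resize_uploaded = 0 OR fullsize_uploaded = 0;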
Thank you for this module, it's nice to have this opportunity.
There is one thing:
When I change the orientation of an image, the image in S3 storage is not updated. Can I somehow force the module to update it in this case?
that's a known issue. i'm actually in the process of developing a fix for this problem, among a couple of others, as well as improving the sync task. should release 2.1 fairly soon
I added some photos to a sub album in a hidden album. I then moved the sub album into a public area.
The thumbs and resizes get lost on the move. I'm not sure if it is aws_s3 just taking time to catch up or something.
Funny thing is if I move the album back they all work again.
So I tried running the sync task. The problem I have is that it keeps stalling and every time I press resume it seems to start again.
Why does it keep stalling?
yep, this is something i've noticed too. i was re-organising my production gallery a bit and when i shifted an album the images promptly "disappeared". i've actually got this resolved in 2.1. i'm also in the process of improving the re-sync task to compare what's on s3 against what's local, and also check its own cache to see what actually needs updating, if anything.
as far as the stalled task goes, do you have the scheduler module installed and the crontab running?
Yes scheduler installed, crontab setup. Every time I resume the sync it gets a bit further.
How did you get around the missing images after organising?
well, the moving of images originally worked, but that's where the functionality seemed to end. if you shifted a single image from album to album, it would move. but if you shifted an entire album, the S3 library had no idea what to do, so it just didn't do anything. there's a whole bunch of code added to rectify this problem, and a bunch of other stuff. still working on it at present but should be able to get 2.1 out the door in the next couple of days
Sooner than I thought!
Announcing the release of v2.1 of this module. More info and download here: http://www.danneh.org/2011/01/amazon-s3-cdn-distribution-for-menalto-gallery-3-v2-1-released/
As usual, feedback on a postcard. Or a reply. Whichever suits you best
Will this filter thru git?
Thanks for your continued efforts!
no, it won't. i'm on the same wavelength as Serge D in that maintaining our code across so many repositories is kinda hard, and git sometimes takes ages to get fixed code into the main repo. i've asked for all my code to be pulled out of the git repository, and i've updated all my codex pages to point to the relevant files on my website. (i can also keep track of the # of downloads from there - i don't get that with git).
That's alright Dan, as long as I know I can keep up to date.
OK got the latest code.
I still have problems performing a sync.
To get around the problem I had moving albums, I put the album back in its original location, as it worked there, then turned off S3 and moved the album to its new location, which worked as it was all local. I then turned S3 back on and performed a sync with everything in place.
But I am still having the problem where the sync stalls... It goes for minutes then stalls and I have to hit the resume button, whereby it seems to start again, get a bit further than last time, then stall again.
Also there is a very long list of (what I now believe to be obsolete) scheduled tasks. Can I get rid of them somehow? They are for a date in the past so nothing is happening with them.
the fact that the task is marked "stalled" doesn't necessarily mean it has stalled. g3's admin area marks tasks as stalled if they haven't updated in (quite a short space of) time. if you're on a slow uplink and it's uploading some hefty sized images, it's expected that it won't update the status of this task until the upload is done. resuming the task only gets another copy of it seemingly running *again*, and *again* (until you've got maybe 3 or 4 going), conflicting with each other and making the sync process even longer. i've tested the sync task over and over again, ignoring the "stalled" in the task list, and nothing seems to be awry there.
you can "remove all finished" tasks, there's no reason for them to be there once they're done.
Ah, ok I will try leaving it alone until tomorrow then
The list of tasks is not completed tasks, but scheduled tasks. Like when you press run to sync S3, it adds a 'one off' scheduled task. This is the list I mean... it seems to be from when I added an album of images and all the tasks to upload the images got scheduled but then, for whatever reason, didn't happen, and now they are old tasks I don't need any more.
I think they would automatically go if the scheduled task was recognised as completed. But seeing as those tasks won't happen now, they won't go away (I deleted the images).
hm. interesting. you deleted the images before the task got a chance to complete/start? i don't think i've ever run that through my test cases, and thinking about how it works inside, it makes sense that it wouldn't work. you may have to go into the database, empty out the schedules and tasks tables, and set a re-sync going again.
i'll add that to my test cases and get a fix going for it!
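for reference, the quick-and-dirty clear-out would be something along these lines (be careful: this wipes *all* scheduled and pending tasks, not just the s3 ones, so back the tables up first if you're unsure):
# use with care: clears ALL schedules and pending tasks, not just aws_s3's
TRUNCATE TABLE schedules;
TRUNCATE TABLE tasks;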
Righty, I left it alone after last night; the last time I looked at it it said it had stalled at about 22:30.
I have just come back to it at about 12:00 the next day (14+ hrs later) and it is still stalled at the same time.
Viewing a new album I added last night, none of the images are available. If I toggle S3 off, then they are available.
Also could you explain where I can empty the list of scheduled tasks from please?
I am about to add a new album with S3 enabled to see what happens, as this may all still be to do with the initial re-organisation. Although it would be nice not to have to delete and re-add 200+ photos and re-tag them! If it can be sorted with the ones in place, all the better.
I know how to do the 'dirty' rebuild if that would help.
also getting this error when trying to run the Extract_exif task:
Extract Exif data Failed The item_thumb_hash property does not exist in the Aws_S3_Meta_Model class