is gallery3 the wrong solution at scale?

hohum

Joined: 2014-02-12
Posts: 2
Posted: Wed, 2014-02-12 23:46

Hey there,

(Intro: I'm a busy devops and sysadmin guy working on knowing everything about everything - i.e. I'm time starved. I built a NAS and wrote the obligatory blog post on my blog akerneladay and did all the cool stuff - MariaDB, php-fpm, apache workers etc. I'm running CentOS 6.5 for Gallery3 and the latest rpm version and frankly - it's quick...)

For a small set of photos gallery3 is wunderbar! But for my family's photos, 100k+ so far, it's been very hard. I can't select 10 photos and rotate them, moving photos is unintuitive, and it takes an eon to upload photos (gal3upload is great, but only one instance and only 1 at a time). I tried to import > 100k photos & it chugged through to 50% in 3 days, then encountered a problem getting EXIF data on 1 photo & that was game, set & match. When I tried to delete an album with about 20000 pictures in it, the site errored out. In the past when the delete operation has errored, you just have to wait for MariaDB to finish and the job does get done - eventually. It seems to go through one photo at a time:

-----
# for i in `seq 0 9`; do mysql -u<myuser> -p -e "show processlist" | grep -v Sleep; done
Id User Host db Command Time State Info Progress
2894 gallery3 localhost gallery3 Query 0 updating UPDATE `items`\nSET `left_ptr` = `left_ptr` - 2\nWHERE `left_ptr` > '1463' 0.000
3332 root localhost NULL Query 0 init show processlist 0.000
Id User Host db Command Time State Info Progress
2894 gallery3 localhost gallery3 Query 0 updating UPDATE `items`\nSET `left_ptr` = `left_ptr` - 2\nWHERE `left_ptr` > '1463' 0.000
3333 root localhost NULL Query 0 init show processlist 0.000
Id User Host db Command Time State Info Progress
2894 gallery3 localhost gallery3 Query 1 updating UPDATE `items`\nSET `left_ptr` = `left_ptr` - 2\nWHERE `left_ptr` > '1463' 0.000
3334 root localhost NULL Query 0 init show processlist 0.000
Id User Host db Command Time State Info Progress
2894 gallery3 localhost gallery3 Query 1 updating UPDATE `items`\nSET `left_ptr` = `left_ptr` - 2\nWHERE `left_ptr` > '1463' 0.000
3335 root localhost NULL Query 0 init show processlist 0.000
Id User Host db Command Time State Info Progress
2894 gallery3 localhost gallery3 Query 1 updating UPDATE `items`\nSET `left_ptr` = `left_ptr` - 2\nWHERE `left_ptr` > '1463' 0.000
3336 root localhost NULL Query 0 init show processlist 0.000
Id User Host db Command Time State Info Progress
2894 gallery3 localhost gallery3 Query 1 updating UPDATE `items`\nSET `left_ptr` = `left_ptr` - 2\nWHERE `left_ptr` > '1463' 0.000
3337 root localhost NULL Query 0 init show processlist 0.000
Id User Host db Command Time State Info Progress
2894 gallery3 localhost gallery3 Query 1 updating UPDATE `items`\nSET `left_ptr` = `left_ptr` - 2\nWHERE `left_ptr` > '1463' 0.000
3338 root localhost NULL Query 0 init show processlist 0.000
Id User Host db Command Time State Info Progress
2894 gallery3 localhost gallery3 Query 1 updating UPDATE `items`\nSET `left_ptr` = `left_ptr` - 2\nWHERE `left_ptr` > '1463' 0.000
3339 root localhost NULL Query 0 init show processlist 0.000
Id User Host db Command Time State Info Progress
2894 gallery3 localhost gallery3 Query 1 updating UPDATE `items`\nSET `left_ptr` = `left_ptr` - 2\nWHERE `left_ptr` > '1463' 0.000
3340 root localhost NULL Query 0 init show processlist 0.000
Id User Host db Command Time State Info Progress
2894 gallery3 localhost gallery3 Query 1 updating UPDATE `items`\nSET `left_ptr` = `left_ptr` - 2\nWHERE `left_ptr` > '1463' 0.000
3341 root localhost NULL Query 0 init show processlist 0.000
-----

and it's really slow.

I like gallery3 - it's really nice. But I can't use it like this. So before I throw more time at working it through, am I misusing the product? Should I be attempting to use it at scale with hundreds of thousands of photos like a Picasa replacement? Or is there something wrong with my setup?

Also, I tried the gpt organize module but couldn't even get it to show up in the menus in Firefox or Chrome, & I couldn't get anything out of lmgtfy or rtfm.

Feedback appreciated.

Cheers
Marc

 
floridave

Joined: 2003-12-22
Posts: 27300
Posted: Thu, 2014-02-13 06:18
Quote:
MariaDB

Seems that has popped up before with other issues. Sorry no time right now to investigate as it is time for bed.

Dave
_____________________________________________
Blog & G2 || floridave - Gallery Team

 
hohum

Joined: 2014-02-12
Posts: 2
Posted: Thu, 2014-02-13 12:44
floridave wrote:
Quote:
MariaDB

Seems that has popped up before with other issues. Sorry no time right now to investigate as it is time for bed.

Dave

Spun up a new VM with default everything, same deal. Thanks anyway Dave :)

 
Sad Eeyore

Joined: 2014-01-25
Posts: 3
Posted: Sun, 2014-02-16 16:32

I just made a thread about the exact same issue. What Gallery3 seems to be doing is updating a large swath of the 'items' table every time you insert a new image, which makes the entire thing very, very slow. The more images you have, the slower adding new ones becomes, until it isn't usable anymore. I'll see if there is a workaround but I haven't found one yet.
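A rough way to see this effect outside Gallery3 is the sketch below. It is not Gallery3 code: it uses an in-memory SQLite table named `items` with a `left_ptr` column as in the process list above, plus a `right_ptr` column whose name is an assumption, and it assumes a new item is appended as the last child of its parent. It counts how many row updates a single insert triggers.

```python
import sqlite3


def build_flat_tree(n_albums):
    """Root item (id 1) with n_albums empty child albums, stored as nested-set rows."""
    db = sqlite3.connect(":memory:")
    # right_ptr is an assumed column name; only left_ptr appears in the log.
    db.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, left_ptr INTEGER, right_ptr INTEGER)"
    )
    db.execute("INSERT INTO items VALUES (1, 1, ?)", (2 * n_albums + 2,))
    rows = [(i + 2, 2 * i + 2, 2 * i + 3) for i in range(n_albums)]
    db.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)
    return db


def add_child(db, parent_id):
    """Append a leaf under parent_id and return the number of row updates it caused."""
    (parent_right,) = db.execute(
        "SELECT right_ptr FROM items WHERE id = ?", (parent_id,)
    ).fetchone()
    # Open a gap of 2 for the new leaf: shift everything at or after the parent's right edge.
    touched = db.execute(
        "UPDATE items SET right_ptr = right_ptr + 2 WHERE right_ptr >= ?",
        (parent_right,),
    ).rowcount
    touched += db.execute(
        "UPDATE items SET left_ptr = left_ptr + 2 WHERE left_ptr > ?",
        (parent_right,),
    ).rowcount
    db.execute(
        "INSERT INTO items (left_ptr, right_ptr) VALUES (?, ?)",
        (parent_right, parent_right + 1),
    )
    return touched


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        # Insert one photo into the first album (id 2) of a gallery with n albums.
        print(n, add_child(build_flat_tree(n), 2))
```

In this toy layout the number of row updates for one insert grows in step with the number of items that sit after the target album, which matches the "slower and slower" behaviour described above.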

 
mblythe

Joined: 2011-01-23
Posts: 3
Posted: Sun, 2014-07-20 23:08

Copy & paste from Sad Eeyore's other thread: http://galleryproject.org/node/112671

I was having a similar problem, so I investigated a bit. I found that it's due to the way Gallery3 stores the hierarchical album/photo data. It uses a scheme called "Modified Preorder Tree Traversal", or MPTT. This is a way of representing hierarchical data that makes lookups & reading the data very fast, but the consequence is that modifying the hierarchical data is computationally expensive. MPTT is described very well here: http://www.sitepoint.com/hierarchical-data-database/
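To illustrate the trade-off, here is a minimal sketch of the idea in plain Python. It is not Gallery3's implementation; the item names and numbers are made up. Each item carries a left and a right pointer, reading a whole subtree is a simple range check, and removing one leaf forces every later pointer to shift by 2, which is the same shape as the UPDATE shown in the process list earlier in this thread.

```python
# Hypothetical tiny gallery: name -> (left_ptr, right_ptr)
TREE = {
    "root": (1, 10),
    "holidays": (2, 7),
    "beach.jpg": (3, 4),
    "hill.jpg": (5, 6),
    "pets": (8, 9),
}


def descendants(tree, name):
    """Everything strictly inside name's (left, right) range, in tree order."""
    left, right = tree[name]
    inside = [n for n, (l, r) in tree.items() if left < l and r < right]
    return sorted(inside, key=lambda n: tree[n][0])


def descendant_count(tree, name):
    """The count needs no traversal at all: each descendant uses two numbers."""
    left, right = tree[name]
    return (right - left - 1) // 2


def delete_leaf(tree, name):
    """Remove a leaf and close the gap it leaves; returns a new tree."""
    left, right = tree[name]
    assert right == left + 1, "only leaves are handled in this sketch"
    shifted = {}
    for n, (l, r) in tree.items():
        if n == name:
            continue
        # Every pointer after the removed leaf moves down by 2.
        shifted[n] = (l - 2 if l > left else l, r - 2 if r > right else r)
    return shifted


if __name__ == "__main__":
    print(descendants(TREE, "holidays"))
    print(descendant_count(TREE, "root"))
    print(delete_leaf(TREE, "beach.jpg"))
```

The read side touches only the rows it returns, while the delete rewrites pointers on unrelated items such as "pets", which is where the cost comes from on a table with hundreds of thousands of rows.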

I have to admit that the use of MPTT is a reasonable design decision. One tenet of programming is "optimize the common case". In this case, reading the tree structure (e.g. displaying the web page to site visitors) is much more common than modifying it (e.g. uploading new photos), so the use of MPTT seems appropriate.

I do agree, however, that the implementation of MPTT might be improved. For instance, instead of updating every existing item whenever a single image is added, the MPTT data could be updated after the "current batch" of photos is added. This would introduce a small period of time where the MPTT data is inconsistent, so the impact of that would need to be evaluated. Also, "batch" updating like that may be more expensive than the current implementation when uploading a small number of photos.
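As a rough sketch of what such a batch update could look like (an illustration only, not a patch for Gallery3): assuming each item also records its parent, which is an assumption about the schema here, all left/right pointers can be recomputed in a single depth-first pass once the batch is in. The function name `rebuild_pointers` and the ids are hypothetical.

```python
from collections import defaultdict


def rebuild_pointers(parent_of):
    """Recompute nested-set pointers from parent links in one pass.

    parent_of maps item id -> parent id (None for the single root).
    Returns item id -> (left, right).
    """
    children = defaultdict(list)
    root = None
    for item, parent in parent_of.items():
        if parent is None:
            root = item
        else:
            children[parent].append(item)

    left = {}
    pointers = {}
    counter = 1
    # Iterative depth-first walk, so very large galleries do not hit the recursion limit.
    stack = [(root, False)]
    while stack:
        item, leaving = stack.pop()
        if leaving:
            # All descendants are numbered; close this item's range.
            pointers[item] = (left[item], counter)
        else:
            left[item] = counter
            stack.append((item, True))
            # Reverse so children are visited in their original order.
            for child in reversed(children[item]):
                stack.append((child, False))
        counter += 1
    return pointers


if __name__ == "__main__":
    # Hypothetical ids: 1 is the root, 2 and 5 are albums, 3 and 4 are photos in album 2.
    print(rebuild_pointers({1: None, 2: 1, 3: 2, 4: 2, 5: 1}))
```

A rebuild like this visits every item once no matter how many photos were added, which is why it would only pay off for large batches and could be slower than the per-item update for a handful of uploads.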

Unfortunately, I don't think we'll ever see an official release of Gallery3 with any fix for this, since the developers have stepped down.

If I find some time, I may try to make some improvements myself. I've forked the Gallery3 code to my GitHub here: https://github.com/mblythe86/gallery3/tree/3.0.x and that's where I'll post any changes I make.