migrating from G1 to G2 and Cyrillic characters

jnc

Joined: 2004-07-10
Posts: 3
Posted: Sat, 2004-07-10 02:12

I am migrating from Gallery 1.4.1 to G2 (current) and trying to import my G1 albums, which have Cyrillic letters in their comments, descriptions, and titles.

In the import dialog, I see what is shown in import-1.png. I was able to fix the letters by calling iconv() in Gallery1DataParser::loadFile() to convert from KOI8-R (the default Russian codepage in G1) to Windows-1251 (an alternative Russian codepage). A second conversion, of $album->fields['title'] and $album->fields['description'] in Gallery1DataParser::loadAlbumFields() from Windows-1251 to UTF-8, was needed to get normal letters.

After that fix, the text displayed normally, as in import-2.png.

However, after the import, the album name is still garbled, as in import-3.png. Even if I revert my changes, almost nothing changes (see import-4.png).

So the question is: how do I import the albums, photos, and comments correctly? Which codepage should they be in to satisfy the Migrate module? Any help would be greatly appreciated.

(The fourth attachment is in the next post.)

Attachment: import-3.png (2.78 KB)
 
jnc

Joined: 2004-07-10
Posts: 3
Posted: Sat, 2004-07-10 02:13

Fourth screenshot

Attachment: import-4.png (2.25 KB)
 
jmullan

Joined: 2002-07-28
Posts: 974
Posted: Sat, 2004-07-10 06:15
jnc wrote:
I am migrating from Gallery 1.4.1 to G2 (current) and trying to import my G1 albums, which have Cyrillic letters in their comments, descriptions, and titles.

Hey, terrific! I haven't gotten an example album with Cyrillic characters. If I get one, I will work on the character set conversion.

Please review this thread:
http://gallery.menalto.com/index.php?name=PNphpBB2&file=viewtopic&t=14703&highlight=migration

jnc wrote:
In the import dialog, I see what is shown in import-1.png. I was able to fix the letters

I hadn't thought about that much. It really doesn't matter how they show up in that dialog as long as it works.

Quote:
by calling iconv() function in Gallery1DataParser::loadFile()
...
Gallery1DataParser::loadAlbumFields() from Windows-1251 to UTF-8 was needed in order to get normal letters.

I don't think that is the right place to do a conversion to fix the character set, but if you are fixing it, you should convert it immediately to UTF-8.
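
As a rough illustration of converting straight to UTF-8 (written in Python rather than Gallery's PHP, and not taken from the Gallery code), the sketch below shows that a single KOI8-R to UTF-8 step yields the same bytes as the two-step route through Windows-1251. The sample title and function names are made up for the example. In PHP the one-step equivalent would be iconv('KOI8-R', 'UTF-8', $text).

```python
SAMPLE = "Тестовый альбом"  # hypothetical album title, not real Gallery data


def two_step(raw: bytes) -> bytes:
    """KOI8-R -> Windows-1251 -> UTF-8, mirroring the workaround quoted above."""
    cp1251 = raw.decode("koi8_r").encode("cp1251")
    return cp1251.decode("cp1251").encode("utf-8")


def direct(raw: bytes) -> bytes:
    """KOI8-R -> UTF-8 in a single conversion."""
    return raw.decode("koi8_r").encode("utf-8")


koi8_bytes = SAMPLE.encode("koi8_r")  # what a KOI8-R G1 install would store

# Both routes end up with identical UTF-8 bytes for this sample.
assert two_step(koi8_bytes) == direct(koi8_bytes)
print(direct(koi8_bytes).decode("utf-8"))
```

The intermediate Windows-1251 step adds nothing for text that both codepages can represent, so one conversion at one point in the import is easier to reason about.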

Quote:
So the question is: how do I import the albums, photos, and comments correctly? Which codepage should they be in to satisfy the Migrate module? Any help would be greatly appreciated.

That's a good question. If I had sample data from you I could work on the character set conversions to fix it for you. Otherwise, I don't have any non-ASCII characters to import, so I can't test it.

All of the imported text fields should be converted from whatever character set you are using to UTF-8 when the BBCode conversion happens. (This still has to be coded.)
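
A minimal sketch of that idea, in Python for illustration only: every text field of an imported record is converted from an assumed source character set to UTF-8 in one pass, and non-text values are left alone. The function name, the field names, and the KOI8-R default are assumptions for the example, not part of the migrate module.

```python
def fields_to_utf8(fields: dict, source_charset: str = "koi8_r") -> dict:
    """Return a copy of `fields` with every bytes value re-encoded as UTF-8.

    Values that are not bytes (for example integer counters) are copied as-is.
    """
    converted = {}
    for name, value in fields.items():
        if isinstance(value, bytes):
            # Decode from the source charset, then encode as UTF-8.
            converted[name] = value.decode(source_charset).encode("utf-8")
        else:
            converted[name] = value
    return converted


# Hypothetical album record as it might come out of a KOI8-R G1 install.
album = {
    "title": "Альбом".encode("koi8_r"),
    "description": "Описание".encode("koi8_r"),
    "clicks": 12,
}

utf8_album = fields_to_utf8(album)
print(utf8_album["title"].decode("utf-8"))
```

Passing the source character set as a parameter keeps the conversion independent of any one language, since the importer cannot know in advance which codepage a given G1 install used.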

G2 probably will not support album paths that are not URL-safe.
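
For illustration, here is one way such a check could look in Python. It assumes "URL-safe" means only the unreserved characters from RFC 3986 (letters, digits, "-", ".", "_", "~"); the rule G2 actually applies may differ, and the function name is hypothetical.

```python
import re

# Assumption: a path component is URL-safe if it uses only RFC 3986
# unreserved characters. This is not taken from the G2 source.
_URL_SAFE = re.compile(r"[A-Za-z0-9._~-]+")


def is_url_safe(path_component: str) -> bool:
    """Return True if the whole component consists of unreserved characters."""
    return _URL_SAFE.fullmatch(path_component) is not None


print(is_url_safe("test-album"))
print(is_url_safe("тест"))
```

Under this assumption, an album directory named in Cyrillic would need to be renamed or transliterated before import.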

 
jnc

Joined: 2004-07-10
Posts: 3
Posted: Sat, 2004-07-10 12:19

Here is a sample album, created as requested: http://gallery.astrakhan.ru/test-album

Unfortunately, I run a pretty old G1 (1.4.1), and it would be a little hard to upgrade since I modified the skin to make it neater. So there are no meta tags in the page source, but the server uses KOI8-R encoding and all visitors are forced to use the Russian language in KOI8-R.

Let me know if you need a gzip archive of the album or FTP access to it (preferably by e-mail, since the "Watch Topic" feature of this forum doesn't seem to work). Thanks.