simba/Notes

2006-10-02 12:56:36 +02:00
Simple
Integrated
Multiplatform
Backup &
Archive
or
Simply
Accessible
Backup &
Archive
2006-11-20 23:19:39 +01:00
Performance:
workstation 2.4 GHz P4, 512 MB RAM, SATA disk
over 100 Mbps Ethernet
to server 2.4 GHz P4, 2 GB RAM, IDE/SCSI disk array
~800000 files / 992387 links (lots of hard links)
~20 GB / 28330052 kB (multiple storage of hard links)
time: 4h33m
~ 60 files / sec.
~ 1700 kB / sec
cpu usage was negligible.
bandwidth limited by network for large files.
top files/sec was much higher than average (> 300).
conclusion: the bottleneck was seek time on the DA.
(hardly optimizable, except maybe by sorting by inode number)
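The inode-sorting idea could look roughly like this. It is a Python sketch for illustration only, not Simba's Perl code; `files_in_inode_order` is a made-up name, and whether inode order actually matches the on-disk layout depends on the filesystem.

```python
import os


def files_in_inode_order(directory):
    """Return the paths of all entries in `directory`, sorted by inode number."""
    with os.scandir(directory) as entries:
        # On Unix, DirEntry.inode() is taken from the directory listing itself,
        # so sorting needs no extra stat() call (and no extra seek) per entry.
        by_inode = sorted(entries, key=lambda entry: entry.inode())
        return [entry.path for entry in by_inode]
```

The caller would then stat and read the files in the returned order instead of in readdir order.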
2006-11-28 17:01:10 +01:00
Same the next day: 3171 files transferred. 1h20min.
~200 files / sec.
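The rates above can be rechecked with a few lines of arithmetic. The sketch assumes that the next-day figure of ~200 files / sec counts every link examined (taken to be the same 992387 as the day before), not only the 3171 files transferred; the notes do not say so explicitly.

```python
def per_second(count, hours, minutes):
    """Average rate in items per second over a run of the given duration."""
    return count / (hours * 3600 + minutes * 60)


# First full run: 4h33m.
links_per_s = per_second(992387, 4, 33)      # links handled per second
kb_per_s = per_second(28330052, 4, 33)       # kB transferred per second

# Next-day run: 1h20m.
transferred_per_s = per_second(3171, 1, 20)  # files actually transferred
examined_per_s = per_second(992387, 1, 20)   # ASSUMPTION: same link count as the day before

print(f"{links_per_s:.1f} links/s, {kb_per_s:.0f} kB/s")
print(f"{transferred_per_s:.2f} transferred/s, {examined_per_s:.0f} examined/s")
```

Under that assumption the numbers agree with the notes: the first run comes out near 60 links / sec and a bit over 1700 kB / sec, the second near 200 files / sec.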
2006-11-21 22:14:38 +01:00
check filenames with non-ASCII characters.
2006-12-05 00:18:02 +01:00
Seems to work, except if there are non-UTF-8 filenames on a UTF-8 fs
(but that can't really work).
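A quick way to find such names before a backup run, sketched in Python (not part of Simba; `non_utf8_names` is a hypothetical helper):

```python
import os


def non_utf8_names(directory):
    """Return the raw byte names in `directory` that are not valid UTF-8."""
    bad = []
    # Passing a bytes path makes os.listdir return undecoded byte strings.
    for raw in os.listdir(os.fsencode(directory)):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(raw)
    return sorted(bad)
```

This only inspects one directory; a real check would walk the whole tree.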
2006-11-21 22:14:38 +01:00
2006-11-21 22:21:11 +01:00
check gid bits.
2006-11-30 14:55:53 +01:00
2006-12-05 00:18:02 +01:00
2006-11-30 14:55:53 +01:00
Equality checking doesn't work if user is unknown on backup server:
-r--r--r-- 1 4294967294 users 1449 2004-12-01 15:44 2006-11-27T23.22.42/yoyo.hjp.at/home/camel/wrk/perl-5.8.8/util.h
-r--r--r-- 1 4294967294 users 1449 2004-12-01 15:44 2006-11-28T10.18.30/yoyo.hjp.at/home/camel/wrk/perl-5.8.8/util.h
should be one file with two links, not two files.
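One possible cause, which the notes do not confirm, is that the owner comparison goes through the user name and the name lookup fails for an unknown uid. A comparison that falls back to the numeric uid could be sketched like this in Python (illustration only; `owner_key` and `same_owner` are hypothetical names, and the `lookup` parameter exists just to make the sketch testable):

```python
import pwd


def owner_key(uid, lookup=pwd.getpwuid):
    """Return a comparable owner identity for `uid`.

    Known users compare by name, unknown users by numeric uid. The tag in
    front keeps a user literally named "1000" apart from uid 1000.
    """
    try:
        return ("name", lookup(uid).pw_name)
    except (KeyError, OverflowError):
        # No passwd entry on this host: fall back to the number itself.
        return ("uid", uid)


def same_owner(uid_a, uid_b, lookup=pwd.getpwuid):
    """True if both uids denote the same owner, known or unknown."""
    return owner_key(uid_a, lookup) == owner_key(uid_b, lookup)
```

With this, two files owned by the unknown uid 4294967294 compare as equal, so the second one could be hard-linked to the first.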
2006-12-05 00:18:02 +01:00
Tape performance:
DDS4 (Vendor: HP Model: C5683A):
About 5-6 MB/s for /dev/nst0 at 64 kB block size (a larger block size
makes no difference). The test file was about 26 MB and 75% compressible
with gzip.
2007-06-18 21:44:26 +02:00
2007-06-22 22:41:56 +02:00
exit if disk full
2007-11-15 21:18:55 +01:00
On my 800 MHz PIII, the CPU usage is rather high. Some profiling seems
to be necessary (or I should get a faster backup server :-)).
2008-06-20 08:19:18 +02:00
mkdir_p doesn't report the real reason for a failure:
mkdir_p('/backup/2008-06-20T08.10.56/zeno.hjp.at/.', 777)
mkdir_p('/backup/2008-06-20T08.10.56/zeno.hjp.at', 777)
mkdir_p('/backup/2008-06-20T08.10.56', 777)
failed: Read-only file system
cannot mkdir /backup/2008-06-20T08.10.56/zeno.hjp.at/.: No such file or directory at /usr/local/share/perl/5.8.8/Simba/CA.pm line 180, <GEN1> line 1.
The real reason is "Read-only file system", but after mkdir_p returns,
$! is "No such file or directory". (Anyway, Simba::CA::backup2disk
shouldn't just die, but write a message to the log file first; that's
a different problem.)
Ideas:
* Check if File::Path behaves better.
* Die on error and let caller catch the error.
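The second idea (die on error, let the caller catch it) could look like this. It is a Python sketch of the control flow only, not the Perl mkdir_p from Simba: directories are created top-down and the first failing mkdir raises, so the caller sees the error of the topmost directory that could not be created instead of a follow-up error from a deeper level.

```python
import os


def mkdir_p(path, mode=0o777):
    """Create `path` and any missing parents; raise the first real error."""
    path = os.path.abspath(path)  # also drops a trailing "/."
    missing = []
    current = path
    # Walk upwards until an existing directory is found.
    while not os.path.isdir(current):
        missing.append(current)
        parent = os.path.dirname(current)
        if parent == current:
            break
        current = parent
    # Create top-down. The first mkdir that fails raises OSError with the
    # errno and path of the directory that really caused the problem.
    for directory in reversed(missing):
        try:
            os.mkdir(directory, mode)
        except FileExistsError:
            # Somebody else created it in the meantime: fine if it is a directory.
            if not os.path.isdir(directory):
                raise
```

The caller can then catch the exception, write it to the log file, and decide whether to abort.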
MySQL after crash:
-rw-rw---- 1 mysql mysql 10034184192 2010-06-07 09:14 instances.MYD
drwxr-xr-x 8 mysql mysql 4096 2010-06-07 10:20 ../
-rw-rw---- 1 mysql mysql 297649152 2010-06-07 10:20 files.MYI
-rw-rw---- 1 mysql mysql 619416576 2010-06-07 21:03 versions2.MYI
-rw-rw---- 1 mysql mysql 6144 2010-06-15 21:00 sessions.MYI
-rw-rw---- 1 mysql mysql 42952 2010-06-15 21:00 sessions.MYD
-rw-rw---- 1 mysql mysql 20630449152 2010-06-16 10:21 instances.MYI
mri:/var/lib/mysql/simba 10:21 :-) 108# df .
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/mri.wsr.ac.at-mysql
41284928 32332032 6856220 83% /var/lib/mysql
mri:/var/lib/mysql/simba 10:21 :-) 109# psg backup
root 19827 0.0 0.0 10100 1052 ? Ss Jun07 0:00 | \_ /usr/bin/perl /usr/local/bin/backup
root 19407 0.0 0.0 10100 2136 ? Ss Jun08 0:00 | \_ /usr/bin/perl /usr/local/bin/backup
root 3993 0.0 0.0 10100 2136 ? Ss Jun09 0:00 | \_ /usr/bin/perl /usr/local/bin/backup
root 26324 0.0 0.0 10100 2552 ? Ss Jun10 0:00 | \_ /usr/bin/perl /usr/local/bin/backup
root 16156 0.0 0.0 10100 2728 ? Ss Jun11 0:00 | \_ /usr/bin/perl /usr/local/bin/backup
root 462 0.0 0.0 10100 2136 ? Ss Jun12 0:00 | \_ /usr/bin/perl /usr/local/bin/backup
root 17584 0.0 0.0 10100 2748 ? Ss Jun13 0:00 | \_ /usr/bin/perl /usr/local/bin/backup
root 1607 0.0 0.0 10100 2716 ? Ss Jun14 0:00 | \_ /usr/bin/perl /usr/local/bin/backup
root 21566 0.0 0.0 10100 2692 ? Ss Jun15 0:00 | \_ /usr/bin/perl /usr/local/bin/backup
root 18685 0.0 0.0 4352 1188 pts/5 S+ 10:22 0:00 \_ /bin/sh /usr/bin/psg backup
Looks like it has been rebuilding that index for the last 9 days. That's
clearly unacceptable.
2010-09-01 13:00:13 +02:00
Bug: Duplicate detection doesn't seem to work sometimes. I have a lot of
versions with checksum=null and a single instance although they really
are hardlinked to older instances. Example (on mri):
+---------+-----------+-----------+------------+------------+------------+---------------------+----------------+-----------+------------------------------------------+-----------------+
| id | file_type | file_size | file_mtime | file_owner | file_group | file_acl | file_unix_bits | file_rdev | checksum | file_linktarget |
+---------+-----------+-----------+------------+------------+------------+---------------------+----------------+-----------+------------------------------------------+-----------------+
| 1220147 | f | 65559 | 1260229629 | hjp | betreuer | u::rw-,g::r--,o:r-- | | NULL | 41f445efd34cc11f3ec6eb924a5884a7fee0cf15 | NULL |
| 2492389 | f | 65559 | 1260229629 | hjp | betreuer | u::rw-,g::r--,o:r-- | | NULL | NULL | NULL |
| 2492394 | f | 65559 | 1260229629 | hjp | betreuer | u::rw-,g::r--,o:r-- | | NULL | NULL | NULL |
| 2801787 | f | 65559 | 1260229629 | hjp | betreuer | u::rw-,g::r--,o:r-- | | NULL | NULL | NULL |
| 3225686 | f | 65559 | 1260229629 | hjp | betreuer | u::rw-,g::r--,o:r-- | | NULL | NULL | NULL |
+---------+-----------+-----------+------------+------------+------------+---------------------+----------------+-----------+------------------------------------------+-----------------+
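To see how widespread this is, one could look for groups of versions that share the same metadata but where only some have a checksum. The sketch below runs against an in-memory SQLite table that contains only a subset of the versions2 columns shown above, filled with the five example rows; it has not been run against the real MySQL schema. Identical metadata does not prove identical content, so the result is only a list of suspects.

```python
import sqlite3

SUSPECT_SQL = """
SELECT file_size, file_mtime, file_owner, file_group,
       COUNT(*)        AS versions,
       COUNT(checksum) AS with_checksum   -- COUNT(col) skips NULLs
FROM versions2
WHERE file_type = 'f'
GROUP BY file_size, file_mtime, file_owner, file_group
HAVING COUNT(*) > COUNT(checksum) AND COUNT(checksum) > 0
"""


def demo_connection():
    """Build an in-memory table with the example rows (reduced column set)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE versions2 (id INTEGER PRIMARY KEY, file_type TEXT,"
        " file_size INTEGER, file_mtime INTEGER, file_owner TEXT,"
        " file_group TEXT, checksum TEXT)"
    )
    sha = "41f445efd34cc11f3ec6eb924a5884a7fee0cf15"
    ids_and_sums = [(1220147, sha), (2492389, None), (2492394, None),
                    (2801787, None), (3225686, None)]
    conn.executemany(
        "INSERT INTO versions2 VALUES (?, 'f', 65559, 1260229629, 'hjp', 'betreuer', ?)",
        ids_and_sums,
    )
    return conn


def suspect_groups(conn):
    """Return metadata groups that mix checksummed and NULL-checksum versions."""
    return conn.execute(SUSPECT_SQL).fetchall()
```

For the example rows this reports a single group with 5 versions, of which 1 has a checksum.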
2010-09-02 13:40:57 +02:00
Find files whose checksum is null:
select versions2.id, prefix, path from versions2, instances, files, sessions
where file_type = 'f' and checksum is null
and versions2.id=instances.version
and instances.file=files.id
and instances.session=sessions.id
limit 100000;
2014-01-11 19:44:23 +01:00
remove_session:
select v.id from instances i right outer join versions2 v on
i.version=v.id where i.id is null
is very slow. Maybe do two independent queries and compute the difference via a Judy array?
In any case all the cleanup stuff needs to be outside of the loop.
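The two-query variant might be sketched as follows. This is Python with a plain built-in set standing in for the Judy array, and it uses the generic DB-API `execute` of an SQLite connection for illustration; table and column names are the ones from the query above, `orphaned_versions` is a made-up name.

```python
import sqlite3


def orphaned_versions(conn):
    """Return the ids of versions that no instance refers to.

    Runs two cheap independent queries and computes the difference on the
    client side instead of using an outer join.
    """
    all_versions = {row[0] for row in conn.execute("SELECT id FROM versions2")}
    referenced = {row[0] for row in conn.execute("SELECT DISTINCT version FROM instances")}
    return all_versions - referenced
```

Called once after all sessions have been removed, this also keeps the cleanup outside of the loop. Both id sets have to fit into memory, which is the trade-off against the join.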