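The dictionaries are compressed, about 14 MB in total, and are
packaged together with the conversion program. If you already have
the original EBS dictionaries at hand, you can use the program
directly to convert the .tit and .nr files into a standard
dictionary file. There is then no need to download the 14 MB
package, which saves network traffic. The source code follows: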
/*-
* This program will combine .tit and .nr made by EBS into
* one c5 format dict file.
*
* $Id: fo2dict.c,v 1.1 2008/06/15 02:14:11 wxy Exp $
*/
#include <stdio.h>

int main(int argc, char *argv[])
{
    FILE *f1, *f2;
    char fn1[256], fn2[256];
    int lf, cr, ul;
    int c1, c2, len1, len2, max1, max2, total;
    int i;
    int next;       /* nonzero when the next byte starts a new headword */

    lf = (int) '\n';
    cr = (int) '\r';
    ul = (int) '_';
    next = 1;
    len1 = len2 = max1 = max2 = total = 0;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s <file base>\n", argv[0]);
        return 1;
    }

    /* snprintf keeps a long base name from overflowing the buffers */
    snprintf(fn1, sizeof(fn1), "%s.tit", argv[1]);
    snprintf(fn2, sizeof(fn2), "%s.nr", argv[1]);

    f1 = fopen(fn1, "r");
    if (NULL == f1) {
        fprintf(stderr, "Open %s failure!\n", fn1);
        return 1;
    }
    f2 = fopen(fn2, "r");
    if (NULL == f2) {
        fprintf(stderr, "Open %s failure!\n", fn2);
        fclose(f1);
        return 1;
    }

    /* .tit holds NUL-terminated headwords, .nr the matching definitions */
    while ((c1 = getc(f1)) != EOF) {
        if (next) {
            /* entry separator: blank line, "_____", blank line */
            putc(lf, stdout);
            for (i = 0; i < 5; i++)
                putc(ul, stdout);
            putc(lf, stdout);
            putc(lf, stdout);
            next = 0;
        }
        if (0 == c1) {
            /* end of headword: copy its definition up to the next NUL */
            total++;
            putc(lf, stdout);
            len1 = 0;
            len2 = 0;
            while ((c2 = getc(f2)) != EOF) {
                if (0 == c2)
                    break;
                if (c2 != cr)
                    putc(c2, stdout);
                len2++;
            }
            max2 = len2 > max2 ? len2 : max2;
            next = 1;
            continue;
        }
        len1++;
        max1 = len1 > max1 ? len1 : max1;
        /* drop carriage returns from headwords as well */
        if (c1 != cr)
            putc(c1, stdout);
    }

    fclose(f1);
    fclose(f2);
    fprintf(stderr, "max len of key: %6d\n", max1);
    fprintf(stderr, "max len of con: %6d\n", max2);
    fprintf(stderr, "total key: %6d\n", total);
    return 0;
}
README:
This project was created to transform Buddhadharma dictionaries
originally made by the Electronic Buddhadharma Society (EBS) into
the standard dictionary format of the DICT protocol. The
dictionaries can then be accessed through a dictd server, remotely
or locally. The server information is as follows:
dictd 1.10.2/rf on FreeBSD 4.11-RELEASE
On 127.0.0.1: up 07:45:04, 36 forks (4.6/hour)

Database          Headwords     Index       Data   Uncompressed
dfb                   31592    484 kB    2995 kB        5617 kB
chen                   5875     78 kB     250 kB         489 kB
szfs                   8781    144 kB    1057 kB        2274 kB
fxcdtb                 1045     16 kB     297 kB         646 kB
fymyj                  1090     15 kB     237 kB         398 kB
wdhy                   2050     41 kB     818 kB        1522 kB
english-chinese        9948    181 kB    1245 kB        3019 kB
fxcd                  14692    272 kB    2552 kB        5623 kB
zen560                  552     11 kB     222 kB         405 kB
theravada              1510     29 kB     196 kB         528 kB
syfy                    571      8 kB     201 kB         330 kB
yzcj                   1706     52 kB     296 kB         604 kB
ldgs                   2126    105 kB    1682 kB        2505 kB
About 15 MB of disk space is needed for the compressed data. All
.dict.dz and .index files are in the data/ directory, and
rfc2229.txt is in the doc/ directory. More documentation,
including a Chinese translation, will be written later. Some
scripts (dictd.conf, dictd.sh, and so on) are in the examples/
directory. In the src/ directory, fo2dict.c converts .tit and .nr
files into the standard dictionary format, which can then be
processed by `dictfmt' to make the .dict and .index files. Usage
is simple. Compile it with:
    % gcc -o fo2dict fo2dict.c
Suppose dfb.tit and dfb.nr are in the same directory as fo2dict.
Then run:
    % ./fo2dict dfb
The dictionary is written to stdout in C5 format, and some
statistics are printed to stderr. To save the output to a file,
type:
    % ./fo2dict dfb > dfb.c5.txt
To make the .dict and .index files, type:
    % dictfmt -c5 --allchars --locale zh_TW.Big5 dfb < dfb.c5.txt
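For reference, the layout of one record in the C5 text can be read
off fo2dict.c: a blank line, a line of five underscores, another
blank line, the headword, and then the definition. The sketch below
formats a single record that way into a buffer. The helper name
c5_entry and the sample strings are made up for illustration; they
are not part of the package.

```c
#include <stdio.h>

/*
 * Format one record the way fo2dict.c writes it:
 *   blank line, "_____", blank line, headword, newline, definition.
 * Returns the number of characters written (not counting the final
 * NUL), or -1 if dst is too small.
 * c5_entry is a hypothetical helper name used only in this sketch.
 */
int c5_entry(char *dst, size_t cap, const char *key, const char *body)
{
    int n = snprintf(dst, cap, "\n_____\n\n%s\n%s", key, body);

    if (n < 0 || (size_t) n >= cap)
        return -1;
    return n;
}
```

Calling c5_entry(buf, sizeof(buf), "amitabha", "Infinite Light")
fills buf with a record that `dictfmt -c5' should accept as one
headword with a one-line definition.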
INSTALL:
Please read README first. Before installing, make sure you have:
  * dictd (preferably version 1.10.2 or later). If not, download
    it from http://sourceforge.net/projects/dict/
  * A Chinese (Big5) input method and a terminal that can display
    Chinese characters, such as rxvt. If not, please consult the
    Chinese-Howto documentation for your Unix-like system.
Then unpack the tarball and start the server:
    % tar zxvf budict.tar.gz -C /tmp
    % cd /tmp/budict/examples
    % ./dictd.sh start
To check that dictd is running, try:
    % ps -ax | grep dictd
If it appears in the process list, then try:
    % dict -I
You will get:
dictd 1.10.2/rf on FreeBSD 4.11-RELEASE
On 127.0.0.1: up 04:19:21, 4 forks (0.9/hour)

Database          Headwords     Index       Data   Uncompressed
dfb                   31592    484 kB    2995 kB        5617 kB
chen                   5875     78 kB     250 kB         489 kB
szfs                   8781    144 kB    1057 kB        2274 kB
fxcdtb                 1045     16 kB     297 kB         646 kB
fymyj                  1090     15 kB     237 kB         398 kB
wdhy                   2050     41 kB     818 kB        1522 kB
english-chinese        9948    181 kB    1245 kB        3019 kB
fxcd                  14692    272 kB    2552 kB        5623 kB
zen560                  552     11 kB     222 kB         405 kB
theravada              1510     29 kB     196 kB         528 kB
syfy                    571      8 kB     201 kB         330 kB
yzcj                   1706     52 kB     296 kB         604 kB
ldgs                   2126    105 kB    1682 kB        2505 kB
This means dictd is working correctly. You can then try:
    % dict amitabha
The definition of the word will be shown. You can of course also
type Traditional Chinese characters after `dict'.
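Behind the scenes, the `dict' client talks to dictd using the DICT
protocol (RFC 2229), normally over TCP port 2628. A lookup is a
single DEFINE command line ending in CRLF. The sketch below only
builds that line; it does not open a connection. The function name
dict_define_cmd is made up for this example, and the exact line the
real `dict' program sends may differ in detail.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build a DICT protocol (RFC 2229) DEFINE command line.
 * db is a database name such as "dfb", or "*" to search all
 * databases. The word is sent as a quoted string.
 * Assumption for this sketch: ASCII headwords only. Words with a
 * double quote, backslash, CR or LF are rejected rather than
 * escaped. Note that Big5 text can legitimately contain the byte
 * 0x5C (backslash), so real Big5 lookups need proper escaping.
 * Returns the line length, or -1 on error.
 * dict_define_cmd is a hypothetical name, not part of dictd.
 */
int dict_define_cmd(char *dst, size_t cap, const char *db, const char *word)
{
    int n;

    if (strpbrk(word, "\"\\\r\n") != NULL)
        return -1;
    n = snprintf(dst, cap, "DEFINE %s \"%s\"\r\n", db, word);
    if (n < 0 || (size_t) n >= cap)
        return -1;
    return n;
}
```

With db set to "*" and word set to "amitabha", the line asks the
server to search every database listed by `dict -I'.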
N.B.: All the examples assume that you unpacked the tarball into
the directory `/tmp/budict'. If you unpacked it into another
directory, please change the paths in dictd.conf that point to
the .dict.dz and .index files, and the path in dictd.sh
(-c /tmp/budict/examples/dictd.conf). The DICTD variable in
dictd.sh should also be set to the full path of dictd on your
system, because not all systems use the same default path.
dictd.conf:
database dfb {
    data "/tmp/budict/data/00dfb.dict.dz"
    index "/tmp/budict/data/00dfb.index"
}
database chen {
    data "/tmp/budict/data/01chen.dict.dz"
    index "/tmp/budict/data/01chen.index"
}
database szfs {
    data "/tmp/budict/data/02szfs.dict.dz"
    index "/tmp/budict/data/02szfs.index"
}
database fxcdtb {
    data "/tmp/budict/data/03fxcdtb.dict.dz"
    index "/tmp/budict/data/03fxcdtb.index"
}
database fymyj {
    data "/tmp/budict/data/04fymyj.dict.dz"
    index "/tmp/budict/data/04fymyj.index"
}
database wdhy {
    data "/tmp/budict/data/05wdhy.dict.dz"
    index "/tmp/budict/data/05wdhy.index"
}
database english-chinese {
    data "/tmp/budict/data/06english-chinese.dict.dz"
    index "/tmp/budict/data/06english-chinese.index"
}
database fxcd {
    data "/tmp/budict/data/07fxcd.dict.dz"
    index "/tmp/budict/data/07fxcd.index"
}
database zen560 {
    data "/tmp/budict/data/08560.dict.dz"
    index "/tmp/budict/data/08560.index"
}
database theravada {
    data "/tmp/budict/data/09theravada.dict.dz"
    index "/tmp/budict/data/09theravada.index"
}
database syfy {
    data "/tmp/budict/data/10syfy.dict.dz"
    index "/tmp/budict/data/10syfy.index"
}
database yzcj {
    data "/tmp/budict/data/11yzcj.dict.dz"
    index "/tmp/budict/data/11yzcj.index"
}
database ldgs {
    data "/tmp/budict/data/12ldgs.dict.dz"
    index "/tmp/budict/data/12ldgs.index"
}
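If the tarball is unpacked somewhere other than /tmp/budict, every
entry above needs a different path prefix. As a rough sketch, the
helper below formats one `database' block for a given data
directory, so the blocks could be regenerated instead of edited by
hand. The name conf_entry and the directory /opt/budict/data used
in the note after the code are assumptions for illustration only.

```c
#include <stdio.h>

/*
 * Format one dictd.conf "database" block for files stored under dir.
 * name is the database name (e.g. "dfb"); base is the file name
 * without extension (e.g. "00dfb").
 * Returns the length written, or -1 if dst is too small.
 * conf_entry is a hypothetical helper name used only in this sketch.
 */
int conf_entry(char *dst, size_t cap, const char *dir,
               const char *name, const char *base)
{
    int n = snprintf(dst, cap,
                     "database %s {\n"
                     "    data \"%s/%s.dict.dz\"\n"
                     "    index \"%s/%s.index\"\n"
                     "}\n",
                     name, dir, base, dir, base);

    if (n < 0 || (size_t) n >= cap)
        return -1;
    return n;
}
```

For example, conf_entry(buf, sizeof(buf), "/opt/budict/data",
"dfb", "00dfb") produces the dfb block with the new prefix. The
database name and the file base are passed separately because they
do not always match (zen560 is stored as 08560).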
dictd.sh:
#!/bin/sh
DICTD="/usr/local/sbin/dictd"

case "$1" in
restart )
    $0 stop
    sleep 2
    $0 start
    ;;
stop )
    killall dictd
    ;;
start )
    echo ' Starting dictd service...'
    ${DICTD} -c /tmp/budict/examples/dictd.conf --locale zh_TW.Big5 > /dev/null &
    ;;
esac
ACKNOWLEDGEMENT:
The original data is from the Electronic Buddhadharma Society (EBS)
and has been transformed into the standard dictionary format of the
DICT protocol. The transformation was made automatically by a
program. The website of the Electronic Buddhadharma Society is:
http://www.baus-ebs.org/
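The system I use is FreeBSD, and the dictd version is 1.10.2. After
converting to C5 format, you can add the ACKNOWLEDGEMENT text above
on top of the first ``_____''; it will become the preface of the
dictionary. The C5 file is an editable dictionary file: if the
content has errors, you can edit and correct it, and once it is
verified, use dictfmt to generate the .dict and .index files.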
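The preface step can be done in any text editor. As a minimal
sketch, it can also be scripted: because the output of fo2dict
begins with the first ``_____'' separator, writing the preface text
first and the C5 file after it places the preface above that
separator. The function names below are made up for this example.

```c
#include <stdio.h>

/* Copy everything from in to out; returns 0 on success, -1 on error. */
static int copy_stream(FILE *in, FILE *out)
{
    char buf[4096];
    size_t n;

    while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}

/*
 * Write the preface text, then the C5 file, to out.
 * The C5 file is assumed to start with the first "_____" separator,
 * as fo2dict output does, so the preface ends up above it.
 * add_preface is a hypothetical helper name used only in this sketch.
 */
int add_preface(FILE *preface, FILE *c5, FILE *out)
{
    if (copy_stream(preface, out) != 0)
        return -1;
    return copy_stream(c5, out);
}
```

The three streams can be ordinary files opened with fopen, for
instance a file holding the ACKNOWLEDGEMENT text, dfb.c5.txt, and a
new output file that is then passed to dictfmt.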