Guest2 Guest
|
Posted: Thu Jul 01, 2004 2:57 am Post subject: Pinyin |
|
|
| chinapage wrote: |
You ask me to provide pinyin. This is a great idea. However, this is a big task, which must wait for some volunteer to implement. Will you do it? Do you know someone who might be interested in this? As for myself, I do not know pinyin, and must use a dictionary to look up each and every word.
Ming |
Ming (and Guest),
Just to let you know, there are currently several tools that can help you to semi-automatically add pinyin to a digital Chinese text. You can check out http://www.mandarintools.com and http://www.wenlin.com for some examples. Also, in general Wenlin is a great tool for helping you read Chinese texts even if you don't know all the characters. |
|
|
 |
Guilinbog
Joined: 24 May 2007 Posts: 29
|
Posted: Thu May 24, 2007 5:59 pm Post subject: Pinyin |
|
|
Hi,
I might be able to help with the adding pinyin problem...
Which pages with Chinese characters are the most important to add pinyin to? |
|
|
 |
chinapage Site Admin

Joined: 03 Jun 2002 Posts: 3548 Location: New Jersey, U.S.
|
Posted: Fri May 25, 2007 1:58 pm Post subject: |
|
|
Dear Guilinbog:
Thank you very much for your offer to help - I surely need it.
Perhaps you might start with the page
http://www.chinapage.com/quote/saying.html
Adding pinyin, I think, will be very useful.
If you like, you can also email to me at
forum@chinapage.com
which is only for forum business.
Ming |
|
|
 |
Guilinbog
Joined: 24 May 2007 Posts: 29
|
Posted: Mon May 28, 2007 11:39 am Post subject: email sent |
|
|
Hi,
I sent an email to you about this with reference to a test page that I did.
I can send the email again if you didn't receive it..
I'll be waiting to hear from you... |
|
|
 |
Aolung

Joined: 10 Jul 2002 Posts: 1037
|
Posted: Wed May 30, 2007 2:54 pm Post subject: Test |
|
|
Dear Ming,
this is just a test to demonstrate that it's quite easy to convert your pages' Big5 (or other) texts to pinyin, if - and only if - you are willing to install a Unicode font that displays the pinyin diacritics! If not, adding the tone marks (= numbers) by hand would be quite a cumbersome procedure
(Plz switch your browser to unicode in order to read the following)
gāoshānliúshuǐ
qǐrényōutiān
kèzhōuqiújiàn
zhāosānmùsì
zìxiāngmáodùn
lǎomǎshítú
lànyúchōngshù
yǎn'ěrdàolíng
sàiwēngshīmǎ
cǎomùjiēbīn
Regards
Alfred |
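For anyone who does end up with numbered tones (e.g. `sai4 weng1 shi1 ma3`), the conversion to tone marks can be scripted instead of done by hand. The following is only a minimal sketch in Python, not the method used by Wenlin or any other tool mentioned in this thread. The function names are made up for illustration, it handles lowercase syllables only, and it assumes the usual placement rule (the mark goes on a or e if present, on the o of ou, otherwise on the last vowel).

```python
import unicodedata

# Combining marks for tones 1-4: macron, acute, caron, grave.
TONE_COMBINING = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}
VOWELS = "aeiou\u00fc"  # \u00fc is u with diaeresis


def mark_syllable(syllable):
    """Turn a numbered syllable such as 'ma3' into its tone-marked form."""
    if not syllable or not syllable[-1].isdigit():
        return syllable
    base, tone = syllable[:-1], int(syllable[-1])
    if tone not in TONE_COMBINING:
        return base  # neutral tone (0 or 5): no mark
    if "a" in base:
        idx = base.index("a")
    elif "e" in base:
        idx = base.index("e")
    elif "ou" in base:
        idx = base.index("o")
    else:
        positions = [i for i, ch in enumerate(base) if ch in VOWELS]
        if not positions:
            return base
        idx = positions[-1]
    # NFC composes vowel + combining mark into one precomposed character.
    marked = unicodedata.normalize("NFC", base[idx] + TONE_COMBINING[tone])
    return base[:idx] + marked + base[idx + 1:]


def mark_phrase(numbered):
    """Apply mark_syllable to each space-separated syllable."""
    return " ".join(mark_syllable(s) for s in numbered.split())


print(mark_phrase("sai4 weng1 shi1 ma3"))
```

The numbered input still has to come from a dictionary or a person who knows the readings; the script only removes the typing of diacritics.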
|
|
 |
Guilinbog
Joined: 24 May 2007 Posts: 29
|
Posted: Wed May 30, 2007 9:08 pm Post subject: not quite right |
|
|
Hi,
I'm using IE 6 and viewing this page in Unicode (UTF-8)
I see the pinyin but I also see a few of those boxes amidst the text .. the kind of boxes you see when a program can't read the characters correctly..
does anyone else have this problem? |
|
|
 |
Aolung

Joined: 10 Jul 2002 Posts: 1037
|
Posted: Thu May 31, 2007 4:23 am Post subject: |
|
|
| Guilinbog wrote: | | I see the pinyin but I also see a few of those boxes amidst the text |
I'm using Mac (Safari browser) and it displays correctly.
Would you tell us what are the diacritical tone marks that don't display (=are 'boxes') on your screen? |
|
|
 |
Guilinbog
Joined: 24 May 2007 Posts: 29
|
Posted: Thu May 31, 2007 1:34 pm Post subject: pinyin |
|
|
ok.. there are 7 boxes total.. so.. in the following.. the number represents the line number from the top, and + represents the box thing.. I'll write the syllable of the line as it appears...
1. shu+
2. q+
6. l+o m+
8. y+n
9. m+
10. c+o
all other syllables appear to be normal.. |
|
|
 |
Aolung

Joined: 10 Jul 2002 Posts: 1037
|
Posted: Thu May 31, 2007 3:41 pm Post subject: |
|
|
Okay, so your IE UTF-8 seems to have a different encoding for the 3rd tone diacritic than my Mac browser/Wenlin for Mac has - weird
The pinyin text displays correctly with the NS browser (Mac) and is readable with IE (Mac) also (although the 3rd-tone characters are replaced by ones from a different font within the same text!) |
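One way to narrow this down is to look at which Unicode characters the seven boxes correspond to. They are all third-tone vowels (ǎ and ǐ in the lines reported above), and those sit in a different Unicode block from the first-, second- and fourth-tone vowels, so a font that lacks that block could show boxes even though the encoding setting is fine. This is a guess at the cause, not a confirmed diagnosis. The sketch below (Python, helper names invented here) just lists the code point and block of every non-ASCII character in a line.

```python
import unicodedata


def block_of(ch):
    """Name the Unicode block for the Latin ranges that pinyin uses."""
    cp = ord(ch)
    if cp < 0x80:
        return "Basic Latin"
    if cp < 0x100:
        return "Latin-1 Supplement"
    if cp < 0x180:
        return "Latin Extended-A"
    if cp < 0x250:
        return "Latin Extended-B"
    return "other"


def tone_marked_chars(text):
    """Return (character, code point, block) for each non-ASCII character."""
    text = unicodedata.normalize("NFC", text)
    return [(ch, "U+%04X" % ord(ch), block_of(ch)) for ch in text if ord(ch) > 0x7F]


for line in ["gāoshānliúshuǐ", "lǎomǎshítú"]:
    for ch, code_point, block in tone_marked_chars(line):
        print(line, ch, code_point, block)
```

On that reading, the remedy would be a font that covers Latin Extended-B rather than a different encoding setting.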
|
|
 |
chinapage Site Admin

Joined: 03 Jun 2002 Posts: 3548 Location: New Jersey, U.S.
|
Posted: Thu May 31, 2007 4:54 pm Post subject: |
|
|
I just checked this using Microsoft IE7, which works without any
problems.
I normally use Firefox, which also displays all characters without
any problems.
May I suggest that you either upgrade to IE7 or switch to Firefox.
Although I no longer use IE6, my memory tells me that that
used to work for me also. Be sure to change the settings by
View | character encoding | unicode utf-8
I shall post some test for you soon. |
|
|
 |
chinapage Site Admin

Joined: 03 Jun 2002 Posts: 3548 Location: New Jersey, U.S.
|
Posted: Thu May 31, 2007 6:07 pm Post subject: |
|
|
Dear Guilinbog and friends:
Thank you for taking time to code the web page.
I have posted it here.
http://www.chinapage.com/quote/5pinyin.html
As you said, this was done using Microsoft Word, which
saved the result as a html file.
After reviewing it, I found it not quite acceptable.
The file contains a great many references to your web page.
Thus, if you delete your page, it would not work.
We really need a webpage which does not make any
references to external urls.
Alfred posted the pinyin of the words coded in utf-8.
I believe that among the various alternatives, utf-8
is probably the best way to go at the present time.
Please take a look at this web page
http://www.chinapage.com/quote/utf-8.html
With unicode, we can have English, fanti big5, and jianti gb all
encoded and intermixed in one file.
It appears that some of the major web sites from China have
decided to adopt utf-8.
I am seriously considering adopting it also.
I welcome your comments.
Ming |
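If the pages do move to UTF-8, the existing Big5 and GB files would need to be re-encoded. A minimal Python sketch, assuming the source encoding of each file is known in advance (the function name is hypothetical; `big5` and `gb2312` are the names of Python's built-in codecs):

```python
def to_utf8(raw, source_encoding):
    """Decode bytes from a legacy Chinese encoding and re-encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


# A four-character sample phrase, written with escapes so that the result
# does not depend on how this script file itself is saved.
phrase = "\u9ad8\u5c71\u6d41\u6c34"

big5_bytes = phrase.encode("big5")
gb_bytes = phrase.encode("gb2312")

# The legacy byte sequences differ, but both convert to the same UTF-8 bytes.
print(big5_bytes != gb_bytes)
print(to_utf8(big5_bytes, "big5") == to_utf8(gb_bytes, "gb2312"))
```

Any charset declaration inside a converted page would have to be changed to utf-8 as well, otherwise browsers will keep reading it with the old setting.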
|
|
 |
chinapage Site Admin

Joined: 03 Jun 2002 Posts: 3548 Location: New Jersey, U.S.
|
Posted: Thu May 31, 2007 6:47 pm Post subject: |
|
|
Dear Alfred:
I examined your text carefully. Although I am able to read it,
there seem to be some defects.
I found that there are extra blank characters embedded which
seem to cause problems. There should not be blanks.
Will you take a look please.
Ming |
|
|
 |
chinapage Site Admin

Joined: 03 Jun 2002 Posts: 3548 Location: New Jersey, U.S.
|
Posted: Thu May 31, 2007 7:13 pm Post subject: |
|
|
Dear Alfred:
Upon further examination, I believe that your
original post contains errors.
Take for example the next-to-last line, the
correct code should be:
sāi wēng shī mǎ
The first word should be sāi
Ming |
|
|
 |
Guilinbog
Joined: 24 May 2007 Posts: 29
|
Posted: Fri Jun 01, 2007 1:42 am Post subject: hi |
|
|
you are right.. the page i posted is somehow full of referrals to my page... I can't believe I didn't notice that.. of course it won't work -- now I know what you meant in your email when you said "serious problems" crash and burn
i used a combination of online tools and programs and dreamweaver.. the program that puts the pinyin over the words is the one that wrote in all the weird referral code.. other tools I used to change the code to unicode etc.. also.. i didn't use microsoft word - were you referring to me?
the weird pinyin program i'm talking about is called DimSum.. i found it on a webpage that someone on these forums posted, called mandarintools.com
if you like I can try this again .. maybe if I put the various steps in a different order we can avoid the weird code this program puts in.. I would really like to find a way to do this rather quickly and automatically so that it is practical if you want a large number of pages to display pinyin...
----
... i could update my browser to IE 7 but i was wondering if you think that a lot of your visitors might be using IE 6..
but.. also I wonder if other people with IE 6 are even having the same problem.. is there anyone out there with windows xp and IE 6 who has tried looking at this? i have some of the same problems with the sample page: http://www.chinapage.com/quote/utf-8.html
for the record i am switching the IE encoding to UTF-8 when I try to look at these pages.. so that doesn't seem to be the problem... |
|
|
 |
Guilinbog
Joined: 24 May 2007 Posts: 29
|
Posted: Fri Jun 01, 2007 1:58 am Post subject: hi again |
|
|
well.. I'm looking at the source code for the pages in question and I noticed that there seems to be just pinyin with tonal marks as well as just straight chinese characters directly in the html code.. i always thought this was a big no-no that would cause a lot of problems across different browsers, platforms.. etc. etc..
so, what is the final objective? if pinyin with tonal markers is ok to appear directly in the html code that would make a big difference in what tools I might use to do this.. |
|
|
 |
Aolung

Joined: 10 Jul 2002 Posts: 1037
|
Posted: Fri Jun 01, 2007 5:21 am Post subject: |
|
|
| Ming wrote: | Upon further examination, I believe that your
original post contains errors. |
My dictionaries (software) are giving both first and fourth tone (didn't check my paper sources)
Alfred |
|
|
 |
chinapage Site Admin

Joined: 03 Jun 2002 Posts: 3548 Location: New Jersey, U.S.
|
Posted: Fri Jun 01, 2007 7:53 am Post subject: |
|
|
| Alfred wrote: | | My dictionaries (software) are giving both first and fourth tone (didn't check my paper sources) |
The problem is not with which tone to associate with the word.
It has to do with inserting an extra space within a word
(e.g. writing the 'tone' as 'to ne'). I am not sure how the error
crept in in this case. I assure that it was just a typo, and not
due to some automatic software step.
Ming |
|
|
 |
Guilinbog
Joined: 24 May 2007 Posts: 29
|
Posted: Fri Jun 01, 2007 11:11 pm Post subject: Re: hi again |
|
|
| Guilinbog wrote: | well.. I'm looking at the source code for the pages in question and I noticed that there seems to be just pinyin with tonal marks as well as just straight chinese characters directly in the html code.. i always thought this was a big no-no that would cause a lot of problems across different browsers, platforms.. etc. etc..
so, what is the final objective? if pinyin with tonal markers is ok to appear directly in the html code that would make a big difference in what tools I might use to do this.. |
what I'm referring to when i say "pages in question" are the pages that aren't showing up correctly in my browser - including the pinyin as entered on this forum and also on the sample page:
http://www.chinapage.com/quote/utf-8.html
so again.. is it ok to have pinyin with tonal marks directly in the html code for the pages that we are trying to put pinyin on? previously I was avoiding that... |
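Either approach can be made to work. Tone-marked pinyin typed directly into the HTML is fine as long as the file is saved as UTF-8 and the page declares that encoding; the more conservative alternative is to write every non-ASCII character as a numeric character reference, which is plain ASCII and so does not depend on the encoding setting. A small Python sketch of the second option (the function name is made up; `xmlcharrefreplace` is a standard error handler of Python's `encode`):

```python
def to_char_refs(text):
    """Replace every non-ASCII character with a decimal reference like &#462;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# One of the test phrases from the thread, written with escapes.
print(to_char_refs("l\u01ceom\u01cesh\u00edt\u00fa"))
```

References of this kind still need a font that contains the character, so they help with encoding mix-ups but not with missing glyphs.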
|
|
 |
Aolung

Joined: 10 Jul 2002 Posts: 1037
|
Posted: Sat Jun 02, 2007 5:23 am Post subject: |
|
|
| Ming wrote: | the problem is not with which tone to associate with the word.
It has to do with inserting an extra space within a word
(e.g. writing the 'tone' as 'to ne'). I am not sure how the error
crept in in this case. I assure that it was just a typo, and not
due to some automatic software step.
|
Weird:
1) It looks perfect on my screen (i.e. the chengyus are written in one pinyin word without any spaces!)
2) There actually are no typos (I didn't type it in)
Alfred
BTW, plz remove the porn ads within this thread, since, for some unknown reasons, I'm unable to do it myself here. |
|
|
 |
chinapage Site Admin

Joined: 03 Jun 2002 Posts: 3548 Location: New Jersey, U.S.
|
Posted: Sun Jun 03, 2007 7:29 am Post subject: |
|
|
Dear Alfred:
Try the following:
Reset 'character encoding' to standard Western English (ISO-8859).
Use 'copy and paste' to copy what you posted here, and
paste it to another file using the basic text editor.
You will see the real, actual text you have posted.
When I examine this text, I can see the various errors.
By that I mean the code for one word has an extra blank
space in it. This is like coding the word 'space' as 'sp ace'.
As you say, it is weird because we do not know how these errors
came to be.
What is obvious is the fact that we do not yet have a good
program capable of automatically adding tone marks to
a web page.
Ming |
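What Ming describes, viewing the UTF-8 bytes as ISO-8859-1, can be reproduced in a few lines. One possible explanation for the "extra blank" is that the fourth-tone à (U+00E0) is stored in UTF-8 as the two bytes C3 A0, and byte A0 on its own is the no-break space in ISO-8859-1. The Python sketch below illustrates that idea with an invented helper name; it is not a reconstruction of what either poster's software actually did.

```python
def seen_as_latin1(text):
    """Show how UTF-8 encoded text looks when misread as ISO-8859-1."""
    return text.encode("utf-8").decode("latin-1")


# The next-to-last line of the test post, written with escapes.
phrase = "s\u00e0iw\u0113ngsh\u012bm\u01ce"

misread = seen_as_latin1(phrase)
spaces = [i for i, ch in enumerate(misread) if ch == "\u00a0"]

print(repr(misread))
print("no-break space at positions:", spaces)
```

If this is what happened, the blank is an artifact of reading the bytes with the wrong encoding and not a defect in the text that was posted.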
|
|
 |
Aolung

Joined: 10 Jul 2002 Posts: 1037
|
Posted: Sun Jun 03, 2007 12:46 pm Post subject: |
|
|
Dear Ming,
it seems that there is a big misunderstanding between us
1) I don't expect the pinyin text to show up properly without utf-8 enabled!
2) Copy pasting that 'garbage' text - under standard (ISO-8859) - again results in garbage in any 'basic' editor (like my Mac TextEdit).
3) Copy pasting that proper pinyin text - with utf-8 enabled on my browser - results in proper pinyin in Mac TextEdit (which remains readable pinyin text after saving and reopening the file. Nota bene: the basic editor doesn't even have a feature to choose the encoding mode! I copy pasted 2) and 3) within the same document, where they appear side by side as hodgepodge and flawless text respectively!)
4) Same effect, as stated earlier, when copying the pinyin into html-pages (Safari, Netscape and IE Mac, the latter displaying some special diacritics, i.e. 3rd tones, using characters of a different font). Statement 4) only holds for utf-8 being enabled or this encoding being specified in the header of the html-page.
regards
Alfred |
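Specifying the encoding in the header, as mentioned in the last point, might look like the following. This is a sketch in Python that builds a small test page; the page title and function name are placeholders, and the `meta` line is the usual HTML 4 form of the declaration.

```python
import html


def utf8_page(title, lines):
    """Build a minimal HTML page that declares UTF-8 in its header."""
    body = "<br>\n".join(html.escape(line) for line in lines)
    return (
        "<html>\n<head>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        "<title>" + html.escape(title) + "</title>\n"
        "</head>\n<body>\n" + body + "\n</body>\n</html>\n"
    )


page = utf8_page("Pinyin test", ["s\u00e0iw\u0113ngsh\u012bm\u01ce"])

# The bytes saved to disk must really be UTF-8, matching what the header says.
page_bytes = page.encode("utf-8")
print(len(page_bytes), "bytes")
```

With the declaration in place, visitors should not need to switch the browser's encoding menu by hand.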
|
|
 |
chinapage Site Admin

Joined: 03 Jun 2002 Posts: 3548 Location: New Jersey, U.S.
|
Posted: Sun Jun 03, 2007 2:51 pm Post subject: |
|
|
Dear Alfred:
I should have confined myself to the actual issue, rather than
trying to save time and introducing other extraneous issues.
Let us start over again from the beginning. I will post using jpg
instead.
If I use utf code, then I can write these 4 Chinese words as follows:
The first line shows these 4 words with blanks inbetween words.
The second line shows these 4 words without blanks inbetween.
Both of these are proper and correct.
But if you look at the third line, you will note that the first word is
not coded correctly according to UTF-8. This is what you showed.
No matter what you see on your monitor, this is what is sent out
to the Internet.
After I receive this, I then use my software to interpret it.
This is what I see.
The first two lines are ok, because (1) the code received is correct,
and (2) it is interpreted properly.
The third line is not ok, because (1) the code received is not correct,
so that (2) it is impossible for anyone to interpret it properly. This
is why you see a 'rectangle' in place of the un-interpreted word.
Perhaps you can do the conversion again from the beginning. Just
do this one word. See what happens.
Ming |
|
|
 |
Aolung

Joined: 10 Jul 2002 Posts: 1037
|
Posted: Mon Jun 04, 2007 5:05 am Post subject: |
|
|
Dear Ming,
it is this what I'm getting:
Sàiwēngshīmǎ
(i.e. perfect pinyin on my browser - this time, I didn't alter the heading capital letter by hand!)
This is what I saw with my previous try:
sàiwēngshīmǎ
(perfect pinyin as well!)
You're right that there's [space] after the first couple codes when looked at using standard encoding, yet this doesn't result in scrambled pinyin under utf-8 mode on all my browsers or other editors!
Looking at this result under ISO Latin 1 gives the same sequence of characters (with [space] also), using Mac Lateinisch results in a different sequence (without [space]) etc.
BTW, although there's that space as well under standard mode, my sequence of characters displayed are somewhat different from yours shown above.
So, I only can repeat that my (unicode) pinyin displays perfectly on all my devices (except IE Mac - see above remarks).
Alfred
P.S. Now what's about this?
Sāi wēng shī mǎ |
|
|
 |
chinapage Site Admin

Joined: 03 Jun 2002 Posts: 3548 Location: New Jersey, U.S.
|
Posted: Thu Jun 07, 2007 3:31 pm Post subject: |
|
|
| alfred wrote: |
Now what's about this?
Sāi wēng shī mǎ |
This is fine. The error is gone.
Ming |
|
|
 |
Aolung

Joined: 10 Jul 2002 Posts: 1037
|
Posted: Thu Jun 07, 2007 3:45 pm Post subject: |
|
|
| Ming wrote: | | This is fine. The error is gone. |
Still extremely weird: I did nothing but alter fourth to first tone "manually"
Alfred |
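A byte-level comparison suggests why changing only the tone could make the difference. The sketch below (Python, helper name invented for illustration) prints the UTF-8 bytes of the letter a with each of the four tone marks. Of the four, only the fourth-tone form ends in byte A0, the value that ISO-8859-1 treats as a no-break space. Whether that is really what Ming's viewer tripped over is an assumption.

```python
def utf8_hex(ch):
    """Return the UTF-8 bytes of a character as space-separated hex."""
    return " ".join("%02X" % b for b in ch.encode("utf-8"))


# The letter a with tone marks 1 to 4: macron, acute, caron, grave.
tones = {1: "\u0101", 2: "\u00e1", 3: "\u01ce", 4: "\u00e0"}

for tone, ch in tones.items():
    print(tone, "U+%04X" % ord(ch), utf8_hex(ch))
```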
|
|
 |
chinapage Site Admin

Joined: 03 Jun 2002 Posts: 3548 Location: New Jersey, U.S.
|
Posted: Sun Jun 10, 2007 12:09 pm Post subject: |
|
|
Dear Alfred:
I think I have the explanation, and it is very logical.
In pinyin, not every syllable is valid in all 4 tones.
The word 'ma' can have ma1, ma2, ma3 and ma4.
But the word 'lai' only has 2 valid tones: lai2 and lai4.
The other 2 are invalid.
Thus if you spell it with an invalid tone, it is a spelling error.
When the reading software sees an incorrectly spelled word,
it simply displays it as a box symbol.
Please experiment with some words to validate this theory.
Ming |
|
|
 |
Aolung

Joined: 10 Jul 2002 Posts: 1037
|
Posted: Mon Jun 11, 2007 4:53 am Post subject: |
|
|
Dear Ming,
your theory doesn't seem to be too plausible, given that all my Mac tools in fact don't have problems - as you are experiencing. Also, i cannot imagine that each pinyin syllable is encoded according to its "semantic possibility" (this wouldn't make sense!).
Look at the following:
dāi lái lǎi lài
lāi lái lǎi lài
Is your system able to display both, the green and the red pinyin word? (The green one is generated by my Chinese software, the red one altered manually, i.e. with the D replaced by L).
Alfred |
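The point that the encoding knows nothing about syllables can be checked directly: UTF-8 encodes one character at a time, so the bytes for a hand-altered syllable such as lāi are just the bytes of l, ā and i placed one after another. A short Python check (a sketch; the variable names are arbitrary):

```python
real = "d\u0101i"     # dai, first tone, as generated by software
made_up = "l\u0101i"  # lai, first tone, altered by hand

for word in (real, made_up):
    per_char = b"".join(ch.encode("utf-8") for ch in word)
    # Encoding the whole word gives the same bytes as encoding each letter alone.
    print(word.encode("utf-8") == per_char, word.encode("utf-8").hex())
```

So a viewer that can show dāi has everything it needs to show lāi; if neither appears, the cause is more likely the font or the encoding setting than the spelling.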
|
|
 |
chinapage Site Admin

Joined: 03 Jun 2002 Posts: 3548 Location: New Jersey, U.S.
|
Posted: Mon Jun 11, 2007 11:25 am Post subject: |
|
|
| alfred wrote: |
Look at the following:
dāi lái lǎi lài
lāi lái lǎi lài
Is your system able to display both, the green and the red pinyin word? |
No. This is a total failure! These are not recognized as Chinese code.
Ming
Last edited by chinapage on Mon Jun 11, 2007 1:33 pm; edited 1 time in total |
|
|
 |
Aolung

Joined: 10 Jul 2002 Posts: 1037
|
Posted: Mon Jun 11, 2007 3:35 pm Post subject: |
|
|
| You wrote: | | No. This is a total failure! These are not recognized as Chinese code. |
Sorry, then this depends on "good old Microsoft technology"
This is what (all my) Mac devices are showing:
Alfred |
|
|
 |
chinapage Site Admin

Joined: 03 Jun 2002 Posts: 3548 Location: New Jersey, U.S.
|
Posted: Tue Jun 12, 2007 12:03 pm Post subject: |
|
|
Yesterday Steve Jobs announced a beta version of the Apple
Safari browser designed to run on Windows XP.
Currently, only 5% of the people use Safari and 15% use
Firefox, with the vast majority using IE.
I am thinking of installing Safari beta to see how it works.
Ming |
|
|
 |
Aolung

Joined: 10 Jul 2002 Posts: 1037
|
Posted: Tue Jun 12, 2007 3:16 pm Post subject: |
|
|
I installed Firefox several times already - and again removed it b/c IMVHO it didn't appear to be satisfactory in certain features. Sorry, I don't remember these at the moment (I'm getting old ) Yet, Firefox seems to be more compatible with certain Microsoft programs (like that MS world map tool).
Alfred |
|
|
 |
|