China the Beautiful Forum Index China the Beautiful
A forum for readers of chinapage.com
 

How to display pinyin and tonal marks

 
Guest2
Guest

Posted: Thu Jul 01, 2004 2:57 am    Post subject: Pinyin

chinapage wrote:

You ask me to provide pinyin. This is a great idea. However, it is a big task that must wait for a volunteer to implement. Will you do it? Do you know someone who might be interested? As for myself, I do not know pinyin and must use a dictionary to look up each and every word. :(

Ming


Ming (and Guest),

Just to let you know, there are currently several tools that can help you to semi-automatically add pinyin to a digital Chinese text. You can check out http://www.mandarintools.com and http://www.wenlin.com for some examples. Also, in general Wenlin is a great tool for helping you read Chinese texts even if you don't know all the characters.

Guilinbog
Joined: 24 May 2007
Posts: 29

Posted: Thu May 24, 2007 5:59 pm    Post subject: Pinyin

Hi,

I might be able to help with the problem of adding pinyin.

Which pages with Chinese characters are the most important to add pinyin to?

chinapage
Site Admin
Joined: 03 Jun 2002
Posts: 3548
Location: New Jersey, U.S.

Posted: Fri May 25, 2007 1:58 pm

Dear Guilinbog:


Thank you very much for your offer to help - I surely need it.

Perhaps you might start with the page

http://www.chinapage.com/quote/saying.html

Adding pinyin, I think, will be very useful.

If you like, you can also email me at
forum@chinapage.com

which is only for forum business.

Ming

Guilinbog
Joined: 24 May 2007
Posts: 29

Posted: Mon May 28, 2007 11:39 am    Post subject: email sent

Hi,

I sent an email to you about this with reference to a test page that I did.

I can send the email again if you didn't receive it.

I'll be waiting to hear from you...

Aolung
Joined: 10 Jul 2002
Posts: 1037

Posted: Wed May 30, 2007 2:54 pm    Post subject: Test

Dear Ming,
this is just a test to demonstrate that it's quite easy to convert your pages' Big5 (or other) texts to pinyin, if, and only if, you are willing to install a Unicode font that displays the pinyin diacritics! If not, adding the tone marks (= numbers) by hand would be quite a cumbersome procedure. :roll:

(Please switch your browser to Unicode in order to read the following.)

gāoshānliúshuǐ
qǐrényōutiān
kèzhōuqiújiàn
zhāosānmùsì
zìxiāngmáodùn
lǎomǎshítú
lànyúchōngshù
yǎn'ěrdàolíng
sàiwēngshīmǎ
cǎomùjiēbīn


Regards

Alfred
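
For readers who start from numbered pinyin (gao1 shan1 ...), the step from tone numbers to tone marks can be automated. The following is only a minimal Python sketch, not the tool used for the test above: the function names are made up for illustration, it assumes lowercase syllables separated by spaces with the tone digit at the end, and it uses the usual placement rule (mark a or e if present, the o in ou, otherwise the last vowel).

```python
import re

# Marked forms of each vowel for tones 1 to 4.
TONE_MARKS = {
    "a": "āáǎà", "e": "ēéěè", "i": "īíǐì",
    "o": "ōóǒò", "u": "ūúǔù", "ü": "ǖǘǚǜ",
}


def mark_syllable(syllable: str) -> str:
    """Convert one numbered syllable such as 'shui3' to 'shuǐ'."""
    match = re.fullmatch(r"([a-zü]+)([1-5])", syllable)
    if not match:
        return syllable  # leave anything unexpected untouched
    letters = match.group(1).replace("v", "ü")  # 'v' is a common stand-in for ü
    tone = int(match.group(2))
    if tone == 5:  # neutral tone: no mark
        return letters
    if "a" in letters:
        index = letters.index("a")
    elif "e" in letters:
        index = letters.index("e")
    elif "ou" in letters:
        index = letters.index("o")
    else:
        positions = [i for i, ch in enumerate(letters) if ch in "iouü"]
        if not positions:
            return letters
        index = positions[-1]
    marked = TONE_MARKS[letters[index]][tone - 1]
    return letters[:index] + marked + letters[index + 1:]


def numbered_to_marked(text: str) -> str:
    """Convert space-separated numbered pinyin to pinyin with tone marks."""
    return " ".join(mark_syllable(s) for s in text.split())


print(numbered_to_marked("gao1 shan1 liu2 shui3"))
print(numbered_to_marked("sai4 weng1 shi1 ma3"))
```

The output is ordinary Unicode text, so it still needs a UTF-8 page and a font with the required glyphs to display properly.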

Guilinbog
Joined: 24 May 2007
Posts: 29

Posted: Wed May 30, 2007 9:08 pm    Post subject: not quite right

Hi,

I'm using IE 6 and viewing this page in Unicode (UTF-8).

I see the pinyin, but I also see a few of those boxes amid the text: the kind of boxes you see when a program can't read the characters correctly.


Does anyone else have this problem?

Aolung
Joined: 10 Jul 2002
Posts: 1037

Posted: Thu May 31, 2007 4:23 am

Guilinbog wrote:
I see the pinyin but I also see a few of those boxes amidst the text


I'm using Mac (Safari browser) and it displays correctly.
Would you tell us which diacritical tone marks don't display (i.e. appear as 'boxes') on your screen?

Guilinbog
Joined: 24 May 2007
Posts: 29

Posted: Thu May 31, 2007 1:34 pm    Post subject: pinyin

OK, there are 7 boxes in total. In the following, the number is the line number from the top, and + represents a box. I'll write each affected syllable as it appears:

1. shu+
2. q+
6. l+o m+
8. y+n
9. m+
10. c+o

all other syllables appear to be normal..
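
One way to look into a report like this is to list the Unicode code points involved. The Python sketch below is an illustration, not a diagnosis: it flags every character in the ten test lines whose code point lies at or above U+0180, the start of the Latin Extended-B block, which is where the third-tone vowels ǎ, ǐ, ǒ and ǔ live. The other marked vowels in the test (including ě) sit in Latin-1 Supplement or Latin Extended-A. That the font used by IE 6 lacks Latin Extended-B glyphs is an assumption, and the function name is made up.

```python
import unicodedata

# The ten test lines from the post above.
LINES = [
    "gāoshānliúshuǐ", "qǐrényōutiān", "kèzhōuqiújiàn", "zhāosānmùsì",
    "zìxiāngmáodùn", "lǎomǎshítú", "lànyúchōngshù", "yǎn'ěrdàolíng",
    "sàiwēngshīmǎ", "cǎomùjiēbīn",
]

# First code point of the Latin Extended-B block.
LATIN_EXTENDED_B_START = 0x0180


def suspect_characters(lines):
    """Return (line number, character) pairs at or above U+0180."""
    found = []
    for number, line in enumerate(lines, start=1):
        # Normalise to precomposed characters before inspecting code points.
        for ch in unicodedata.normalize("NFC", line):
            if ord(ch) >= LATIN_EXTENDED_B_START:
                found.append((number, ch))
    return found


suspects = suspect_characters(LINES)
for number, ch in suspects:
    print(number, f"U+{ord(ch):04X}", unicodedata.name(ch))
print(len(suspects), "characters flagged")
```

The flagged characters fall on lines 1, 2, 6 (twice), 8, 9 and 10, which is the same set of seven positions as the boxes listed above, so a font without these glyphs would be consistent with the report.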

Aolung
Joined: 10 Jul 2002
Posts: 1037

Posted: Thu May 31, 2007 3:41 pm

Okay, so your IE UTF-8 seems to have a different encoding for the 3rd tone diacritic than my Mac browser/Wenlin for Mac has. Weird. :roll:

The pinyin text displays correctly with the NS browser (Mac) and is readable with IE (Mac) as well (although the 3rd-tone characters are rendered in a different font within the same text!)

chinapage
Site Admin
Joined: 03 Jun 2002
Posts: 3548
Location: New Jersey, U.S.

Posted: Thu May 31, 2007 4:54 pm

I just checked this using Microsoft IE7, which works without any
problems.

I normally use Firefox, which also displays all characters without
any problems.

May I suggest that you either upgrade to IE7 or switch to Firefox?

Although I no longer use IE6, my memory tells me that it
used to work for me also. Be sure to change the settings by
View | character encoding | unicode utf-8

I shall post some test for you soon.

chinapage
Site Admin
Joined: 03 Jun 2002
Posts: 3548
Location: New Jersey, U.S.

Posted: Thu May 31, 2007 6:07 pm

Dear Guilinbog and friends:

Thank you for taking time to code the web page.

I have posted it here.
http://www.chinapage.com/quote/5pinyin.html

As you said, this was done using Microsoft Word, which
saved the result as a html file.

After reviewing it, I found it not quite acceptable.
The file contains a great many references to your web page.
Thus, if you delete your page, it would not work.
We really need a web page which does not make any
references to external URLs.

Alfred posted the pinyin of the words coded in utf-8.
I believe that among the various alternatives, utf-8
is probably the best way to go at the present time.

Please take a look at this web page

http://www.chinapage.com/quote/utf-8.html

With Unicode, we can have English, fanti (traditional) Big5, and jianti (simplified) GB all
encoded and intermixed in one file.

It appears that some of the major web sites from China have
decided to adopt utf-8.
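
As a rough illustration of mixing English, Big5 and GB text in one utf-8 file, here is a small Python sketch. It assumes the legacy text arrives as bytes in the `big5` and `gb2312` codecs that ship with Python; the sample strings and the function name are chosen only for the example.

```python
# Legacy bytes as they might be stored in separate Big5 and GB files.
traditional_big5 = "塞翁失馬".encode("big5")
simplified_gb = "塞翁失马".encode("gb2312")


def to_utf8_page(*parts: str) -> bytes:
    """Join already-decoded text parts and encode the result as UTF-8."""
    return "\n".join(parts).encode("utf-8")


page = to_utf8_page(
    "English text",
    traditional_big5.decode("big5"),   # fanti, decoded from Big5
    simplified_gb.decode("gb2312"),    # jianti, decoded from GB
    "sài wēng shī mǎ",                 # pinyin with tone marks
)

# All four lines survive a round trip through the single UTF-8 byte string.
print(page.decode("utf-8"))
```

The key step is decoding each source with its own legacy codec first; once everything is Unicode text, one utf-8 encoding covers all of it.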

I am seriously considering adopting it also.

I welcome your comments.

Ming

chinapage
Site Admin
Joined: 03 Jun 2002
Posts: 3548
Location: New Jersey, U.S.

Posted: Thu May 31, 2007 6:47 pm

Dear Alfred:

I examined your text carefully. Although I am able to read it,
there seem to be some defects.

I found that there are extra blank characters embedded, which
seem to cause problems. There should not be blanks.

Will you take a look, please?

Ming

chinapage
Site Admin
Joined: 03 Jun 2002
Posts: 3548
Location: New Jersey, U.S.

Posted: Thu May 31, 2007 7:13 pm

Dear Alfred:
Upon further examination, I believe that your
original post contains errors.
Take for example the next-to-last line, the
correct code should be:

sāi wēng shī mǎ

The first word should be sāi

Ming

Guilinbog
Joined: 24 May 2007
Posts: 29

Posted: Fri Jun 01, 2007 1:42 am    Post subject: hi

you are right.. the page I posted is somehow full of referrals to my page... I can't believe I didn't notice that.. of course it won't work :( -- now I know what you meant in your email when you said "serious problems" :( crash and burn

I used a combination of online tools and programs and Dreamweaver.. the program that puts the pinyin over the words is the one that wrote in all the weird referral code.. other tools I used to change the code to Unicode etc.. also.. I didn't use Microsoft Word - were you referring to me?

the weird pinyin program I'm talking about is called DimSum.. I found it on a webpage that someone on these forums posted, called mandarintools.com

if you like I can try this again .. maybe if I put the various steps in a different order we can avoid the weird code this program puts in.. I would really like to find a way to do this rather quickly and automatically so that it is practical if you want a large number of pages to display pinyin...

----
... I could update my browser to IE 7 but I was wondering if you think that a lot of your visitors might be using IE 6..

but.. also I wonder if other people with IE 6 are even having the same problem.. is there anyone out there with Windows XP and IE 6 who has tried looking at this? I have some of the same problems with the sample page: http://www.chinapage.com/quote/utf-8.html

for the record I am switching the IE encoding to UTF-8 when I try to look at these pages.. so that doesn't seem to be the problem...

Guilinbog
Joined: 24 May 2007
Posts: 29

Posted: Fri Jun 01, 2007 1:58 am    Post subject: hi again

well.. I'm looking at the source code for the pages in question and I noticed that there seems to be just pinyin with tonal marks as well as just straight Chinese characters directly in the HTML code.. I always thought this was a big no-no that would cause a lot of problems across different browsers, platforms.. etc. etc..

so, what is the final objective? if pinyin with tonal markers is OK to appear directly in the HTML code that would make a big difference in what tools I might use to do this..

Aolung
Joined: 10 Jul 2002
Posts: 1037

Posted: Fri Jun 01, 2007 5:21 am

Ming wrote:
Upon further examination, I believe that your
original post contains errors.


My dictionaries (software) give both first and fourth tone (I didn't check my paper sources). :?:

Alfred

chinapage
Site Admin
Joined: 03 Jun 2002
Posts: 3548
Location: New Jersey, U.S.

Posted: Fri Jun 01, 2007 7:53 am

Alfred wrote:
My dictionaries (software) give both first and fourth tone (I didn't check my paper sources). :?:


The problem is not with which tone to associate with the word.
It has to do with inserting an extra space within a word
(e.g. writing the 'tone' as 'to ne'). I am not sure how the error
crept in in this case. I assure that it was just a typo, and not
due to some automatic software step.

Ming

Guilinbog
Joined: 24 May 2007
Posts: 29

Posted: Fri Jun 01, 2007 11:11 pm    Post subject: Re: hi again

Guilinbog wrote:
well.. I'm looking at the source code for the pages in question and I noticed that there seems to be just pinyin with tonal marks as well as just straight Chinese characters directly in the HTML code.. I always thought this was a big no-no that would cause a lot of problems across different browsers, platforms.. etc. etc..

so, what is the final objective? if pinyin with tonal markers is OK to appear directly in the HTML code that would make a big difference in what tools I might use to do this..



what I'm referring to when I say "pages in question" are the pages that aren't showing up correctly in my browser - including the pinyin as entered on this forum and also on the sample page:
http://www.chinapage.com/quote/utf-8.html

so again.. is it ok to have pinyin with tonal marks on directly in the html code for the pages that we are trying to put pinyin on? previously I was avoiding that...
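
For anyone who would rather keep the HTML source pure ASCII, the alternative to raw characters is numeric character references. The Python sketch below shows one way to produce them with the standard `xmlcharrefreplace` error handler; the function name is made up, and whether a given page should use references or raw utf-8 is left open here.

```python
import html


def to_ascii_html(text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


original = "sài wēng shī mǎ 塞翁失馬"
escaped = to_ascii_html(original)
print(escaped)

# A browser (or html.unescape) turns the references back into the same text.
print(html.unescape(escaped) == original)
```

Note that this only converts non-ASCII characters; it does not escape markup characters such as < or &. The references still need a font with the right glyphs, so they change how the page is stored, not whether a box is displayed.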

Aolung
Joined: 10 Jul 2002
Posts: 1037

Posted: Sat Jun 02, 2007 5:23 am

Ming wrote:
The problem is not with which tone to associate with the word.
It has to do with inserting an extra space within a word
(e.g. writing the 'tone' as 'to ne'). I am not sure how the error
crept in in this case. I assure that it was just a typo, and not
due to some automatic software step.


Weird:
1) It looks perfect on my screen (i.e. the chengyus are written in one pinyin word without any spaces!)
2) There actually are no typos (I didn't type it in)

Alfred

BTW, please remove the porn ads within this thread since, for some unknown reason, I'm unable to do it myself here.

chinapage
Site Admin
Joined: 03 Jun 2002
Posts: 3548
Location: New Jersey, U.S.

Posted: Sun Jun 03, 2007 7:29 am

Dear Alfred:

Try the following:

Reset 'character encoding' to standard Western English (ISO-8859).
Use 'copy and paste' to copy what you posted here, and
paste it to another file using the basic text editor.

You will see the real, actual text you have posted.

When I examine this text, I can see the various errors.
By that I mean the code for one word has an extra blank
space in it. This is like coding the word 'space' as 'sp ace'.

As you say, it is weird because we do not know how these errors
came to be.

What is obvious is the fact that we do not yet have a good
program capable of automatically adding tone marks to
a web page.

Ming

Aolung
Joined: 10 Jul 2002
Posts: 1037

Posted: Sun Jun 03, 2007 12:46 pm

Dear Ming,

it seems that there is a big misunderstanding between us. :roll:

1) I don't expect the pinyin text to show up properly without utf-8 enabled!
2) Copy pasting that 'garbage' text - under standard (ISO-8859) - again results in garbage in any 'basic' editor (like my Mac TextEdit).
3) Copy pasting the proper pinyin text - with utf-8 enabled on my browser - results in proper pinyin in Mac TextEdit (which remains readable pinyin text after saving and reopening the file. Nota bene: the basic editor doesn't even have a feature to choose the encoding mode! I copy pasted 1) and 2) within the same document, where they appear side by side as hodgepodge and flawless text respectively!)
4) Same effect, as stated earlier, when copying the pinyin into HTML pages (Safari, Netscape and IE Mac, the latter displaying some special diacritics, i.e. 3rd tones, using characters of a different font). Statement 4) only holds with utf-8 enabled or with this encoding specified in the header of the HTML page.
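
Specifying the encoding in the header of the HTML page could look roughly like the following Python sketch. It builds a minimal page with a `meta` charset declaration and encodes it as utf-8; the function name and page title are made up for the example, and a real page would of course carry more markup.

```python
import html


def build_utf8_page(body_text: str) -> bytes:
    """Return a minimal HTML page, encoded as UTF-8, that declares its charset."""
    page = (
        "<html>\n<head>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        "<title>Pinyin test</title>\n"
        "</head>\n<body>\n"
        f"<p>{html.escape(body_text)}</p>\n"
        "</body>\n</html>\n"
    )
    # The declared charset and the actual byte encoding must agree.
    return page.encode("utf-8")


page_bytes = build_utf8_page("sàiwēngshīmǎ")
print(page_bytes.decode("utf-8"))
```

With the declaration in place, readers should not need to switch the browser's encoding menu by hand, provided the server does not send a conflicting charset.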

regards

Alfred

chinapage
Site Admin
Joined: 03 Jun 2002
Posts: 3548
Location: New Jersey, U.S.

Posted: Sun Jun 03, 2007 2:51 pm

Dear Alfred:

I should have confined myself to the actual issue, rather than
trying to save time and introducing other extraneous issues.

Let us start over again from the beginning. I will post using jpg
instead.

If I use utf code, then I can write these 4 Chinese words as follows:



The first line shows these 4 words with blanks in between words.
The second line shows these 4 words without blanks in between.

Both of these are proper and correct.

But if you look at the third line, you will note that the first word is
not coded correctly according to utf. This is what you showed.
No matter what you see on your monitor, this is what is sent out
to the Internet.

After I receive this, I then use my software to interpret it.

This is what I see.



The first two lines are OK, because (1) the codes received are correct,
and (2) they are interpreted properly.

The third line is not OK, because (1) the codes received are not correct,
so that (2) it is impossible for anyone to interpret them properly. This
is why you see a 'rectangle' in place of the uninterpreted word.

Perhaps you can do the conversion again from the beginning. Just
do this one word. See what happens.

Ming

Aolung
Joined: 10 Jul 2002
Posts: 1037

Posted: Mon Jun 04, 2007 5:05 am

Dear Ming,

this is what I'm getting:
Sàiwēngshīmǎ
(i.e. perfect pinyin on my browser - this time, I didn't alter the heading capital letter by hand!)

This is what I saw with my previous try:
sàiwēngshīmǎ
(perfect pinyin as well!)

You're right that there's [space] after the first couple codes when looked at using standard encoding, yet this doesn't result in scrambled pinyin under utf-8 mode on all my browsers or other editors!

Looking at this result under ISO Latin 1 gives the same sequence of characters (also with [space]), whereas using Mac Roman ("Mac Lateinisch") results in a different sequence (without [space]), etc.

BTW, although there's that space as well under standard mode, my sequence of characters displayed are somewhat different from yours shown above.
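
A possible explanation for the [space] can be sketched in Python by encoding the syllable as utf-8 and then decoding the bytes with the wrong charset, which is roughly what a browser does when switched to a Western encoding. In utf-8 the letter à is the byte pair C3 A0, and byte A0 is the no-break space in ISO Latin 1, whereas ā is the pair C4 81 and contains no such byte. The function name is made up, and this is offered as a likely reading of the symptoms, not as a confirmed account of what either machine did.

```python
def view_as(text: str, wrong_encoding: str) -> str:
    """Encode text as UTF-8, then decode the bytes with another charset."""
    return text.encode("utf-8").decode(wrong_encoding, errors="replace")


fourth_tone = "sài"
first_tone = "sāi"

# Raw UTF-8 bytes of each syllable, in hexadecimal.
print(fourth_tone.encode("utf-8").hex(" "))
print(first_tone.encode("utf-8").hex(" "))

# What each syllable turns into under the wrong charset.
for charset in ("latin-1", "mac_roman"):
    for syllable in (fourth_tone, first_tone):
        shown = view_as(syllable, charset)
        has_space = any(ch.isspace() for ch in shown)
        print(charset, repr(shown), "contains a space:", has_space)
```

Under this reading the "space" is simply the second byte of a correctly encoded à seen through ISO Latin 1, which would also fit the observation that Mac Roman shows no space and that the utf-8 view is fine.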

So, I only can repeat that my (unicode) pinyin displays perfectly on all my devices (except IE Mac - see above remarks).

Alfred

P.S. Now what about this?
Sāi wēng shī mǎ

chinapage
Site Admin
Joined: 03 Jun 2002
Posts: 3548
Location: New Jersey, U.S.

Posted: Thu Jun 07, 2007 3:31 pm

alfred wrote:

Now what about this?
Sāi wēng shī mǎ


This is fine. The error is gone.

Ming

Aolung
Joined: 10 Jul 2002
Posts: 1037

Posted: Thu Jun 07, 2007 3:45 pm

Ming wrote:
This is fine. The error is gone.


Still extremely weird: I did nothing but alter the fourth tone to the first tone "manually". :roll:

Alfred

chinapage
Site Admin
Joined: 03 Jun 2002
Posts: 3548
Location: New Jersey, U.S.

Posted: Sun Jun 10, 2007 12:09 pm

Dear Alfred:

I think I have the explanation, and it is very logical.

In pinyin, not every syllable is valid in all 4 tones.
The word 'ma' can have ma1, ma2, ma3 and ma4.
But the word 'lai' only has 2 valid tones: lai2 and lai4.
The other 2 are invalid.
Thus if you spell it with an invalid tone, it is a spelling error.

When the reading software sees an incorrectly spelled word,
it simply displays it as a box symbol.

Please experiment with some words to validate this theory.

Ming

Aolung
Joined: 10 Jul 2002
Posts: 1037

Posted: Mon Jun 11, 2007 4:53 am

Dear Ming,

your theory doesn't seem too plausible, given that none of my Mac tools have the problems you are experiencing. Also, I cannot imagine that each pinyin syllable is encoded according to its "semantic possibility" (this wouldn't make sense!).

Look at the following:

dāi lái lǎi lài
lāi lái lǎi lài


Is your system able to display both, the green and the red pinyin word? (The green one is generated by my Chinese software, the red one altered manually, i.e. with the D replaced by L).
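
The point that encoding works per character, not per syllable, can be checked with a short Python sketch. It prints the code points and utf-8 bytes of a syllable that exists in Mandarin (dāi) and one that does not (lāi); the helper name is made up for the example.

```python
def describe(syllable: str):
    """Return the code points and the UTF-8 bytes (hex) of a syllable."""
    code_points = [f"U+{ord(ch):04X}" for ch in syllable]
    return code_points, syllable.encode("utf-8").hex(" ")


for syllable in ("dāi", "lāi"):
    code_points, utf8_hex = describe(syllable)
    print(syllable, code_points, utf8_hex)

# Apart from the first letter, the two byte sequences are identical.
print("dāi".encode("utf-8")[1:] == "lāi".encode("utf-8")[1:])
```

Nothing in the bytes records whether the syllable is a valid Mandarin sound, so software that can draw ā in one syllable has the same information for the other.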

Alfred

chinapage
Site Admin
Joined: 03 Jun 2002
Posts: 3548
Location: New Jersey, U.S.

Posted: Mon Jun 11, 2007 11:25 am

alfred wrote:

Look at the following:

dāi lái lǎi lài
lāi lái lǎi lài

Is your system able to display both, the green and the red pinyin word?


No. This is a total failure! These are not recognized as Chinese code.



Ming


Last edited by chinapage on Mon Jun 11, 2007 1:33 pm; edited 1 time in total

Aolung
Joined: 10 Jul 2002
Posts: 1037

Posted: Mon Jun 11, 2007 3:35 pm

You wrote:
No. This is a total failure! These are not recognized as Chinese code.


Sorry, then this depends on "good old Microsoft technology" ;)

This is what (all my) Mac devices are showing:



Alfred

chinapage
Site Admin
Joined: 03 Jun 2002
Posts: 3548
Location: New Jersey, U.S.

Posted: Tue Jun 12, 2007 12:03 pm

Yesterday Steve Jobs announced a beta version of the Apple
Safari browser designed to run on Windows XP.

Currently, only 5% of people use Safari and 15% use
Firefox, with the vast majority using IE.

I am thinking of installing Safari beta to see how it works.

Ming
Back to top
View user's profile Send private message Visit poster's website
Aolung



Joined: 10 Jul 2002
Posts: 1037

PostPosted: Tue Jun 12, 2007 3:16 pm    Post subject: Reply with quote

I have installed Firefox several times already, and removed it again because IMVHO it wasn't satisfactory in certain features. Sorry, I don't remember these at the moment (I'm getting old :oops: ). Yet Firefox seems to be more compatible with certain Microsoft programs (like that MS world map tool).

Alfred

All times are GMT - 3 Hours