Discussion:
google translate and non-ascii characters
Andreas Kemnade
2012-08-25 18:53:02 UTC
Hi,

I tried to use translate.google.com with Dillo and it does not work
with non-ASCII characters. I do not know HTTP well enough to say
which character set should be sent in the POST request, but it seems
not to be what Google expects. Words with non-ASCII characters are not
translated, and the non-ASCII characters are turned into question marks.

Greetings
Andreas Kemnade
corvid
2012-08-25 20:10:09 UTC
Post by Andreas Kemnade
I tried to use translate.google.com with Dillo and it does not work
with non-ASCII characters. I do not know HTTP well enough to say
which character set should be sent in the POST request, but it seems
not to be what Google expects. Words with non-ASCII characters are not
translated, and the non-ASCII characters are turned into question marks.
With our default http_user_agent setting, I see this problem as well.
With it set to a Firefox string, translate.google.com seems to work fine.
Andreas Kemnade
2012-08-26 08:47:44 UTC
Hi,

On Sat, 25 Aug 2012 20:10:09 +0000
Post by corvid
Post by Andreas Kemnade
I tried to use translate.google.com with Dillo and it does not work
with non-ASCII characters. I do not know HTTP well enough to say
which character set should be sent in the POST request, but it seems
not to be what Google expects. Words with non-ASCII characters are not
translated, and the non-ASCII characters are turned into question marks.
With our default http_user_agent setting, I see this problem as well.
With it set to a Firefox string, translate.google.com seems to work fine.
I captured an HTTP request, re-encoded it as UTF-8, and sent it using netcat.
Google then translated it correctly, without any change to the
user agent.
So is Google working around Firefox problems, or is Google broken,
or is Dillo broken here?
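
For anyone who wants to repeat the netcat experiment, here is a minimal sketch of how such a request could be built with the form body in a chosen character set. The host, path, and field name are hypothetical placeholders, not the ones translate.google.com actually uses; only the encoding step is the point.

```python
from urllib.parse import urlencode


def build_post_request(host: str, path: str, fields: dict, charset: str) -> bytes:
    """Return the raw bytes of a form POST whose field values use `charset`."""
    # urlencode percent-encodes every non-ASCII byte, so the body is pure ASCII.
    body = urlencode(fields, encoding=charset).encode("ascii")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


# Hypothetical host, path and field name, for illustration only.
utf8_request = build_post_request(
    "example.org", "/translate", {"text": "Grüße"}, "utf-8"
)
latin1_request = build_post_request(
    "example.org", "/translate", {"text": "Grüße"}, "iso-8859-1"
)
```

The resulting bytes could be written to a file and piped into netcat. The two requests differ only in the percent-encoded body and the Content-Length header.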

Greetings
Andreas Kemnade
Andreas Kemnade
2012-08-26 15:05:54 UTC
Hi,
Post by corvid
Post by Andreas Kemnade
I tried to use translate.google.com with Dillo and it does not work
with non-ASCII characters. I do not know HTTP well enough to say
which character set should be sent in the POST request, but it seems
not to be what Google expects. Words with non-ASCII characters are not
translated, and the non-ASCII characters are turned into question marks.
With our default http_user_agent setting, I see this problem as well.
With it set to a Firefox string, translate.google.com seems to work fine.
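Setting it to a Firefox string gives
<meta content="text/html; charset=UTF-8" http-equiv="content-type">
instead of
<meta content="text/html; charset=ISO-8859-1" http-equiv="content-type">

A browser normally submits a form in the character set of the page, so
with the default user agent Dillo sends ISO-8859-1. But Google seems to
always want UTF-8 from the HTML form, whichever charset the page
declares. So it seems to be a clear Google bug.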
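
The mismatch can be illustrated with a small sketch. It assumes the receiving side decodes the form data as UTF-8 and substitutes a replacement character for invalid bytes; that is a guess at what happens on Google's end, not something confirmed here.

```python
from urllib.parse import quote, unquote

text = "Grüße"

# What a browser would send for a page declared as ISO-8859-1 vs. UTF-8.
sent_latin1 = quote(text, encoding="iso-8859-1")
sent_utf8 = quote(text, encoding="utf-8")

# Assumption: the server always decodes as UTF-8 and replaces invalid bytes.
seen_from_latin1 = unquote(sent_latin1, encoding="utf-8", errors="replace")
seen_from_utf8 = unquote(sent_utf8, encoding="utf-8", errors="replace")

print(sent_latin1, "->", seen_from_latin1)
print(sent_utf8, "->", seen_from_utf8)
```

Under that assumption, only the UTF-8 submission survives the round trip. The ISO-8859-1 bytes are not valid UTF-8, so the non-ASCII characters come back as replacement characters, which matches the question marks seen in the translation.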

Greetings
Andreas Kemnade
