I have been trying to use the Google Transliterate API through its RESTful interface, since that is easy to call from a server-side language (C# here).
I came across this URL format: http://www.google.com/transliterate/indic?tlqt=1&langpair=en|hi&text=bharat%2Cindia&tl_app=3 which returns JSON in this format:
[
{
"ew" : "bharat",
"hws" : [
"भारत","भरत","भरात","भारात","बहरत",
]
},
{
"ew" : "india",
"hws" : [
"इंडिया","इन्डिया","इण्डिया","ईन्डिया","इनडिया",
]
},
]
I used HttpWebRequest and HttpWebResponse to fetch the JSON, but in the web browser the values came back as Unicode escape sequences, such as:
[ { "ew" : "bharat", "hws" : [ "\u092D\u093E\u0930\u0924","\u092D\u0930\u0924","\u092D\u0930\u093E\u0924","\u092D\u093E\u0930\u093E\u0924","\u092C\u0939\u0930\u0924", ] }, { "ew" : "india", "hws" : [ "\u0907\u0902\u0921\u093F\u092F\u093E","\u0907\u0928\u094D\u0921\u093F\u092F\u093E","\u0907\u0923\u094D\u0921\u093F\u092F\u093E","\u0908\u0928\u094D\u0921\u093F\u092F\u093E","\u0907\u0928\u0921\u093F\u092F\u093E", ] }, ]
So I applied the approach from this article to the JSON string, and it returned:
[ { "ew" : "bharat", "hws" : [ "भारत","भरत","भरात","भारात","बहरत", ] }, { "ew" : "india", "hws" : [ "इंडिया","इन्डिया","इण्डिया","ईन्डिया","इनडिया", ] }, ]
FIRST QUESTION: Am I doing it right so far? I ask because the browser DOES NOT show the last " ] ", even though " ] " exists in the HTML source (I am not sure why). Also, when I try to parse the string with the following code (this may be the wrong technique):
var jss = new JavaScriptSerializer();
var dict = jss.Deserialize<Dictionary<string, dynamic>>(the_JSON_string);
it gives me an error saying:
Invalid array passed in, extra trailing ','.
SECOND QUESTION: If I am doing it right so far, can I get some help parsing the Hindi words? What approach should I take, preferably using System.Web.Script.Serialization? Eventually I want to grab the Hindi text for further processing.
Please help, thanks.
Answers:
Method 1
I would recommend Json.NET for parsing JSON strings. The code below (using your sample string) works, and you don't need to do anything to unescape those characters; the JSON parser handles that for you.
// Requires the Json.NET package: using Newtonsoft.Json;
string json = @"[ { ""ew"" : ""bharat"", ""hws"" : [ ""\u092D\u093E\u0930\u0924"",""\u092D\u0930\u0924"",""\u092D\u0930\u093E\u0924"",""\u092D\u093E\u0930\u093E\u0924"",""\u092C\u0939\u0930\u0924"", ] }, { ""ew"" : ""india"", ""hws"" : [ ""\u0907\u0902\u0921\u093F\u092F\u093E"",""\u0907\u0928\u094D\u0921\u093F\u092F\u093E"",""\u0907\u0923\u094D\u0921\u093F\u092F\u093E"",""\u0908\u0928\u094D\u0921\u093F\u092F\u093E"",""\u0907\u0928\u0921\u093F\u092F\u093E"", ] }, ]";
// The parser decodes the \uXXXX escapes while deserializing.
dynamic obj = JsonConvert.DeserializeObject(json);
MessageBox.Show(obj[0].hws[0].ToString()); // first Hindi candidate for "bharat"
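If typed access is preferred over dynamic, the same Json.NET call can target a small class. This is only a sketch: the class and property names are hypothetical, and reading "ew" as the English word and "hws" as the Hindi candidates is an assumption based on the sample response. It also relies on Json.NET accepting the trailing commas, as this answer reports for the sample string.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;

// Hypothetical type mirroring one element of the response array.
public class TransliterationEntry
{
    // "ew" is assumed to hold the original English word.
    [JsonProperty("ew")]
    public string EnglishWord { get; set; }

    // "hws" is assumed to hold the candidate Hindi spellings.
    [JsonProperty("hws")]
    public List<string> HindiWords { get; set; }
}

public static class TransliterationParser
{
    // The root of the response is a JSON array, so the target type is a list.
    public static List<TransliterationEntry> Parse(string json)
    {
        return JsonConvert.DeserializeObject<List<TransliterationEntry>>(json);
    }
}
```

Calling TransliterationParser.Parse(json) on the sample string should then give a list whose first entry exposes the Hindi candidates for "bharat" through HindiWords.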
Method 2
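I think you can remove the last comma as shown below. Note that Remove needs a count of 1 here; with only a start index it deletes everything from the comma to the end of the string, including the closing bracket. This removes only the final comma, so the trailing commas inside each "hws" array still have to be handled separately.
the_JSON_string = the_JSON_string.Remove(the_JSON_string.LastIndexOf(','), 1);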
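The sample response has a trailing comma inside each "hws" array as well as at the end of the outer array, so one option is to strip all of them in a single pass with a regular expression. The following is a minimal sketch, not a general JSON repair tool: the class and method names are hypothetical, and it assumes no string value in the response contains a comma directly before a closing bracket or brace.

```csharp
using System.Text.RegularExpressions;

public static class TrailingCommaFixer
{
    // Matches a comma followed only by optional whitespace and then a closing ']' or '}'.
    private static readonly Regex TrailingComma = new Regex(@",\s*([\]}])");

    // Hypothetical helper: returns the JSON text with every trailing comma removed.
    // Caveat: it would also alter a string value that happens to contain ",]" or ",}".
    public static string StripTrailingCommas(string json)
    {
        // "$1" keeps the closing bracket or brace and drops the comma and whitespace before it.
        return TrailingComma.Replace(json, "$1");
    }
}
```

The cleaned string could then be passed to jss.Deserialize. Since the root of the response is an array, a list target such as List<Dictionary<string, object>> is likely a better fit than Dictionary<string, dynamic>.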
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.