WMSpellChecker
I have been programming for Windows Mobile for a few months now, mainly using Basic4ppc, an excellent low-cost development tool for producing Windows Mobile applications (and desktop applications as well). Basic4ppc is based on the .NET Framework.
One of the strengths of Basic4ppc is that it has great support, not only from the developer but also from other users, who supply Basic4ppc with a lot of extra features through external libraries. I got interested in writing a library myself, and this is how the WMSpellChecker library was born. However, from the very beginning, my idea was that the library should be compatible not only with Basic4ppc but also with Windows Mobile applications developed in Visual Studio and SharpDevelop using VB.NET and C#.
I have seen commercial solutions for spellchecking, but since I wanted to learn how to write a library, I thought this would be a nice thing to give "for free" to fellow developers.
Well, let me get back to the WMSpellChecker-library:
Basically, a spell checker customarily consists of two parts:
1) A set of routines for scanning text and extracting words, and
2) An algorithm for comparing the extracted words against a known list of correctly spelled words (i.e., the dictionary).
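To make the two parts concrete, here is a minimal sketch in Python (not the library's own .NET code). The names extract_words and find_unknown_words are hypothetical, and the word pattern is a simplifying assumption that only covers plain English letters and apostrophes.

```python
import re

# Assumption: a "word" is a run of ASCII letters and apostrophes.
WORD_RE = re.compile(r"[A-Za-z']+")

def extract_words(text):
    """Part 1: scan the text and return (position, word) pairs."""
    return [(m.start(), m.group()) for m in WORD_RE.finditer(text)]

def find_unknown_words(text, dictionary):
    """Part 2: return the words that are not in the dictionary (case-insensitive)."""
    return [word for _, word in extract_words(text)
            if word.lower() not in dictionary]

dictionary = {"the", "sun", "is", "shining"}
print(find_unknown_words("The sun is shinning", dictionary))  # -> ['shinning']
```

A set is used for the dictionary here only because it keeps the lookup short; how the library stores its word list internally is not something this sketch claims to reproduce.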
However, what is mentioned above is only "half" a spell checker, since these days spell checkers also suggest replacements/corrections for misspelled words (among other things such as synonyms and grammar hints). These suggestions can be proposed by a spellchecking engine based on various techniques:
- phonetic algorithms such as "Soundex" among others.
- word lists containing common misspelled words and letters commonly inverted
- the "Near Miss Strategy", introduced by Ispell for UNIX, one of the first spell checkers on the market, with roots dating back to 1971.
- algorithms like "edit distance" which measures the amount of difference between two sequences. A famous one is the "Levenshtein distance".
- and other techniques
The "techniques" mentioned above have all been implemented in the library.
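As an illustration of the edit-distance technique from the list above, here is a small Python sketch of the Levenshtein distance and a naive way to rank suggestions with it. This is not the library's code: the function names and the max_distance cut-off are assumptions made for the example, and a real engine would not scan the whole dictionary for every word.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions and substitutions
    needed to turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def suggest(word, dictionary, max_distance=2):
    """Return dictionary words within max_distance edits, closest first."""
    scored = sorted((levenshtein(word, candidate), candidate)
                    for candidate in dictionary)
    return [candidate for distance, candidate in scored
            if distance <= max_distance]

print(levenshtein("kitten", "sitting"))                  # -> 3
print(suggest("wether", ["weather", "whether", "sun"]))  # -> ['weather', 'whether']
```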
I am aware of the fact that (at least) WM6 already offers spelling-suggestions and a spell checker if PocketWord (Office) has been installed but still I liked this idea so I decided to make a library. In any case, as far as I know, only the dictionary corresponding to the language of WM6 is being installed so if you want to spell check words in other languages you cannot do so.
The way it works.....
First of all, apart from referencing the library itself, you need to add two objects to your application, namely DICTIONARY and COMPUTEDETECTION.
Then you need to load the dictionary files by using "LoadDict". Currently they consist of four separate files, although I may change this in a future release. The dictionary files must be located in the application directory, although you can create sub-folders. This first release only supports English, and the dictionaries distributed with the library must not be tampered with. The next release will bring support for other languages and will also include a separate program for handling dictionaries.
Once the dictionaries have been loaded, you can start the spellchecking by calling the library using "ComputeDetection" which passes on your textbox-control to the library. In case there are words that are not present in the dictionary, then a set of suggestions will be returned to the calling application and at the same time the word which was not found will be shown in the textbox in capital-letters. The suggestions produced by the library can be obtained using "ReturnSuggestions" which returns a string-array.
Once you have shown the suggestions returned by the library, you can let your user in your application decide what to do i.e.
-"IgnoreWord" - ignoring the wrong word
-"AddWord" - entering a word of your own to replace the wrong word
-"ReplaceWord" - replacing the wrong word with a word from the suggestions
At this point, you tell the library to continue spellchecking by using "ContinueDetection". You should also verify if spellchecking has been terminated by using "IsSpellingFinished".
At any time, you can interrupt spellchecking by using "UnloadDict". This will be useful in a future release of the library, so that you can unload an English dictionary and replace it with, for instance, a French dictionary without exiting your own application. However, before unloading the dictionary, you should verify whether a dictionary has been loaded by using "IsDictionaryLoaded".
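The detect / decide / continue loop described above can be pictured with the toy Python model below. To be clear, this is NOT the WMSpellChecker API: the class ToySpellSession, its method signatures and the fixes table are all hypothetical, and only the method names loosely mirror "ContinueDetection", "ReplaceWord" and "IsSpellingFinished". For the real signatures, see the help-file.

```python
class ToySpellSession:
    """Toy model of the check / replace / continue loop (hypothetical API)."""

    def __init__(self, words, dictionary):
        self.words = list(words)
        self.dictionary = set(dictionary)
        self.position = -1  # index of the word currently flagged

    def continue_detection(self):
        """Advance to the next unknown word; return it, or None when done."""
        for i in range(self.position + 1, len(self.words)):
            if self.words[i].lower() not in self.dictionary:
                self.position = i
                return self.words[i]
        self.position = len(self.words)
        return None

    def is_spelling_finished(self):
        return self.position >= len(self.words)

    def replace_word(self, replacement):
        """Replace the currently flagged word."""
        self.words[self.position] = replacement


def run_demo():
    session = ToySpellSession("teh sun is shinning".split(),
                              {"the", "sun", "is", "shining"})
    fixes = {"teh": "the", "shinning": "shining"}  # stands in for the user's choice
    while True:
        wrong = session.continue_detection()
        if wrong is None:
            break
        session.replace_word(fixes[wrong])
    return " ".join(session.words), session.is_spelling_finished()

print(run_demo())  # -> ('the sun is shining', True)
```

Ignoring a word corresponds to simply calling continue_detection again without replacing anything.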
In the help-file, you can find more important information as to the methods/properties available. Please also check out the two sample-projects present in the attachment where the source-code has been fully commented. One is using a classic spellchecking-interface and the other one is using context-menus.
Other comments....
This first release has some limitations, such as support only for English and the need for a textbox-control. However, I will add other features in the future, for instance:
-support for other languages
-dictionary-tools (for creating dictionaries) - will be an external program
-possibility to add a user-dictionary
-possibility to limit amount of suggestions produced by the library (by using a "ranking-system")
-no further need for a textbox-control in your application. Your application will be able to pass on to the library only the word(s) you wish to spellcheck, and the library will only return the suggestion(s). In this way, the spellchecker-library will not "interfere" with your application and you can use whichever control you prefer, although you as a developer have to take care of the words to be passed on to the library for verification.
-spellchecking "on the fly"
-extended error-handling
A few notes regarding dictionaries....
The English dictionary supplied with the library contains nearly 70,000 words. Dictionaries to be used with the library must be sorted, with one word per line and LF = chr(10) as the line ending. In addition, the dictionary should be saved as UTF-8.
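For anyone preparing a word list, the format rules above can be checked with a few lines of Python. This is only a sketch: the helper names are made up, and plain code-point order is an assumption, since the exact sort order the library expects is not stated here.

```python
import bisect

def parse_dictionary(raw_bytes):
    """Decode a UTF-8 word list that uses LF line endings, one word per line."""
    words = raw_bytes.decode("utf-8").split("\n")
    words = [w for w in words if w]  # drop the empty entry after a trailing LF
    if words != sorted(words):       # assumption: plain code-point order
        raise ValueError("dictionary must be sorted")
    return words

def contains(words, word):
    """Binary search, which is what makes a sorted list worthwhile."""
    i = bisect.bisect_left(words, word)
    return i < len(words) and words[i] == word

words = parse_dictionary("apple\nbanana\ncherry\n".encode("utf-8"))
print(contains(words, "banana"), contains(words, "blueberry"))  # -> True False
```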
From the dictionary, a KeyMap is created using either a Soundex or a DoubleMetaphone algorithm. At the moment, the KeyMap is supplied with the library and loaded as an external file, but future releases might create it on the fly (or at least offer an option to do so). With the next release, I will add a utility, to be run on the desktop, which will let you create your own dictionary and a corresponding KeyMap compatible with WMSpellChecker.
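To give an idea of what a phonetic key map looks like, here is a Python sketch using the classic American Soundex rules. The actual file format of the library's KeyMap is not described in this post, so grouping words under their Soundex code in a dict is purely an assumption for illustration, and build_keymap is a made-up name.

```python
from collections import defaultdict

# Classic Soundex digit groups; vowels, y, h and w carry no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        CODES[letter] = digit

def soundex(word):
    """Return the 4-character Soundex code (first letter plus three digits)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    last = CODES.get(letters[0], "")
    for c in letters[1:]:
        code = CODES.get(c, "")
        if code and code != last:
            result += code
        if c not in "hw":  # h and w do not separate equal codes
            last = code
    return (result + "000")[:4]

def build_keymap(words):
    """Group dictionary words by phonetic key (hypothetical in-memory layout)."""
    keymap = defaultdict(list)
    for w in words:
        keymap[soundex(w)].append(w)
    return dict(keymap)

keymap = build_keymap(["weather", "whether", "sun"])
print(soundex("wether"), keymap[soundex("wether")])  # -> W360 ['weather', 'whether']
```

Looking up the key of a misspelled word gives a short list of sound-alike candidates, which can then be ranked by other means such as edit distance.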
Unlike English and Scandinavian dictionaries, dictionaries for German and for Romance (Latin) languages such as Spanish, Italian and French will probably be rather large. This is because German, Italian and similar languages use a lot of suffixes, for instance when conjugating verbs. To overcome this, well-known spellcheckers such as ASpell, ISpell and HunSpell (used by OpenOffice) use dictionaries which mostly contain only the base form of words/verbs, together with a supplementary file called "affix" which contains a lot of grammar rules. This file, combined with the simplified dictionary, overcomes the problem of large dictionaries. However, I believe this system is probably rather memory- and performance-hungry and might not be the best solution for Windows Mobile and PPC. Maybe I will look into this in the future.
Another negative side-effect of using a dictionary that is too large is that it may include more obscure words, which increases the risk that the spelling engine will "miss" real-word errors. The word wether illustrates this. The word is, arguably, so obscure that any occurrence of wether in a passage is more likely to be a misspelling of weather or whether than a genuine occurrence of wether, so that a spellchecker that did not have the word in its dictionary would do better than one that did.
Conclusion....
The library can be used with projects developed with Basic4ppc (PPC and Desktop) but should also work with projects created in Visual Studio and SharpDevelop (using VB.NET and C#). The library has been compiled targeting Framework Version 2.0.
Library-version: 1.0
Helpfile-version: 1.0
As mentioned before, this is my first serious library. Please check it out and let me know how well it integrates in your applications.
Please also give me feedback, suggestions for improvements, missing features, bug-reports etc.
The idea is to add spelling-support for other languages as well and here I might need some help from end-users. I will let you know.
UPDATE - 17/08/2009: In the next few days I will release an updated version with support for other languages as well (starting with French, German, Swedish and Spanish).
Enjoy!
Rgds,
Tilleke
Reserved for future use
Hopefully this evening, or tomorrow at the latest, I will upload a new release of the spellchecking-library:
1) which will permit you to pass a word to the library for verification, and the library will return the suggestions without "interfering" with your application. In this way, there is no further need for a textbox-control in your application and you can apply spellchecking to other controls as well (such as the Webbrowser-control).
2) if I find the time, I will add other languages as well in the above release; otherwise this will follow in the next release.
By the way, has anyone tested it yet? If yes, does it work? Any problems? Please let me know.
rgds,
tilleke
tilleke,
Thanks for this. Looks great, especially the foreign language capability. I would like to add Thai and Lao to the libraries.
Thanks.
Hmm..I'd love to be able to add support for Thai and Lao but I foresee a few problems:
1) in order to do so, I would need word lists (dictionaries) in those languages and which would be free to use/distribute. If you have any, please let me know. I tried to google for some but I couldn't find any.
2) I couldn't locate an emulator supporting Thai or Lao, which I would need in order to work with Thai fonts. I guess there must be some kind of support for Unicode:
ก ข ฃ ค ฅ ฆ ง จ ฉ ช ซ ฌ ญ ฎ ฏ ฐ
ฑ ฒ ณ ด ต ถ ท ธ น บ ป ผ ฝ พ ฟ ภ
ม ย ร ฤ ฤๅ ล ฦ ฦๅ ว ศ ษ ส ห ฬ อ ฮ
In any case, if one found a dictionary then maybe the font problem could be resolved in one way or another
3) I don't know if the "techniques" mentioned by me in my first post, can be applied to the languages of Thai and Lao...
nagbenjy said:
tilleke,
Thanks for this. Looks great, especially the foreign language capability. I would like to add Thai and Lao to the libraries.
Thanks.
For Thai there are a couple of SIPs available - Thaiwince and Thai-G. I don't have the links handy, but a search on Google will find them.
I don't use either of them. What I did was copy tahoma and tahomabd from the WINDOWS\FONTS folder on the desktop. Opened them in font creator and added the Thai and Lao fonts. I then copied the new fonts to the phone WINDOWS directory overwriting the existing fonts. I use Resco Keyboard Pro to enter Thai and Lao text.
I can post the fonts and the Lao language skin if you want them. I will also find some word lists. BUT after thinking more about your methodology in your first post, I don't think it will work. Thai and Lao only have spaces at the end of phrases and sentences, not between words.
Thanks.
Interesting. In any case, I found a word list for Thai. If you send me your e-mail address by PM, I can send it to you and you can let me know if it is any good.
Out of curiosity: How do you write in Thai the following sentences?
"Today the sun is shining. I think I will go to the beach with my friends. Do you want to come with me?"
nagbenjy said:
Thai and Lao only have spaces at the end of phrases and sentences, not between words.
Thanks.
BTW, do you know if Thai SIPs (Thaiwince and Thai-G) or keyboards such as the one you mentioned, Resco Keyboard Pro, insert the Unicode character 'ZERO WIDTH SPACE' (U+200B) between words? If they do, then one could simplify the spell-checking.
See this page for further information:
http://blogamundo.net/dev/2006/12/28/the-zero-width-space/
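If the input method really does insert U+200B between words (which is exactly the open question here), word extraction would become almost trivial. A small Python sketch of the idea, with a made-up helper name:

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE

def split_words(text):
    """Split text on ordinary whitespace and on U+200B."""
    # str.split() does not treat U+200B as whitespace, so normalise it first.
    return text.replace(ZWSP, " ").split()

# Three Thai words joined by invisible separators (assumed input).
sample = ZWSP.join(["วัน", "นี้", "มี"])
print(len(split_words(sample)))  # -> 3
```

Without such separators, the text would have to be segmented into words by some other means before any dictionary lookup could take place.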
Originally Posted by nagbenjy
Thai and Lao only have spaces at the end of phrases and sentences, not between words.
Thanks.
I am still looking into the ZERO WIDTH SPACE and will reply later.
In reply to:
"Today the sun is shining. I think I will go to the beach with my friends. Do you want to come with me?"
Depends on where you live, hot climate or cold climate. For cold climate where the sun hardly shines:
"วันนี้มีแสงแดด ผมคิดว่าจะไปชายหาดกับเพื่อน คุณอยากไปด้วยไหม"
Rough transcription, no breaks separating words, no punctuation:
"wanneemiisaengdaed phomkidwajapaichaaihaadkapphuon khunyaakpaiduaymai"
breaks separating words:
"วัน นี้ มี แสง แดด ผม คิด ว่า จะ ไป ชาย หาด กับ เพื่อน. คุณ อยาก ไป ด้วย ไหม?"
wan nee mii saeng daed. phom kid wa ja pai chaaihaad kap phuon. khun yaak pai duay mai?
Hot climate:
"วันนี้แดดจ้า ผมคิดว่าจะไปชายหาดกับเพื่อน คุณอยากไปด้วยไหม"
Rough transcription, no breaks separating words, no punctuation
wanneedaedjaa phomkidwajapaichaaihaadkapphuon khunyaakpaiduaymai
breaks separating words:
วัน นี้ มี แสง แดด ผม คิด ว่า จะ ไป ชาย หาด กับ เพื่อน. คุณ อยาก ไป ด้วย ไหม?
wan nee mii daed jaa. phom kid wa ja pai chaaihaad kap phuon. khun yaak pai duay mai?
Thanks for the link, interesting
NAG
Update - 17/08/2009: In the next few days I will release an updated version with support for other languages as well (starting with French, German, Swedish and Spanish).
In this regard, I need some help with verifying that the suggested replacements generated by the spellchecking-engine are accurate and reasonable. I need to verify Spanish, French and German, so if any of these languages is your mother tongue, or if you know one of them very well, please send me a PM and I will send you an application that can be run on a normal desktop PC (Windows).
Ahh this is your home ...
Funny that you're working on a spell checker, as I've been looking for a replacement for phatspell for a long time and then gave up. You can find my threads here: http://forum.xda-developers.com/showthread.php?t=350563
....
Hal_rr:
This project (library) is intended more for fellow developers who wish to add spellchecking to their applications. For end-users, this library is not of much use since it's not a standalone program.
For the time being, this project is on hold (although it has evolved a lot compared to the features described in my first post/introduction). However if a developer is interested in an updated version, just let me know.
Who knows, I might one day write a small texteditor with spellchecking support, just for the fun of it.
According to Microsoft evangelist Martin Esmann from Microsoft Denmark, the phone will also launch in Denmark in October.
However, the phone will not have a localized Marketplace, so only the international marketplace will be available. Zune Marketplace will also not be available in Denmark in October.
Source: (Danish) http://www.computerworld.dk/blog/studblog/100275
Just talked to another from Microsoft Denmark, yep, it's clearly launching here in October as well.
I hope this means we will get Windows Phone 7 in Sweden in October too
Windcape said:
only the international marketplace will be available.
What in the world does this mean? There's "World view" in Marketplace which means you can buy apps from any market if you want. The problem is adding a credit card for that, because you'll need an address in a country which is officially supported.
Another question is, will it have Danish dictionaries? Does it mean that official "Language support" only describes display language?
vangrieg said:
What in the world does this mean? There's "World view" in Marketplace which means you can buy apps from any market if you want. The problem is adding a credit card for that, because you'll need an address in a country which is officially supported.
Denmark will be officially supported, but we'll be forced to use the marketplace in English until the Danish translations are out.
So what I'm thinking is that WP7 will "launch" (be available for purchase) in several more countries than the 17 announced, but won't have localizations ready for the marketplace, and will possibly have Zune limits.
They'll probably accept VISA, which everybody has here. Paying with cash is left for 3rd world countries like USA
vangrieg said:
Another question is, will it have Danish dictionaries? Does it mean that official "Language support" only describes display language?
Can't say.
Windcape said:
Denmark will be officially supported, but we'll be forced to use the marketplace in English until the Danish translations are out.
That's not support of Danish marketplace sorry, everybody just gets access to any marketplace he wants, there's nothing new there. If it's not supported you just can't enter your credit card details. So I think there's some confusion there.
vangrieg said:
That's not support of Danish marketplace sorry, everybody just gets access to any marketplace he wants, there's nothing new there. If it's not supported you just can't enter your credit card details. So I think there's some confusion there.
No, they said that we're meant to be able to buy applications from day one. We'll also be able to SELL applications from day one, as Danish developers!
It's only localization that's missing.
Well, developers from Russia can sell applications from day one as well, registration is open here and it was announced initially. You have some big (in WM terms) names here like SPB Software, so Microsoft made sure they can sell stuff.
With regard to localization, don't you ever use Danish? At work, for example?
Depends if it's necessary. If we can get away with English, it's usually fine
So it is necessary sometimes. And how are you going to type things when it is?
Huh? I don't understand what you mean here. We can still write localization support for the individual applications, even if the marketplace isn't localized yet.
Windcape said:
huh? I don't understand what you mean here?
I mean the keyboard and the autocorrection dictionaries, who cares about UI languages (that is, I do, but I'll use English anyway).
No idea if the phones will be sold with language support, I was mostly worrying about the marketplace.
But we use the roman alphabet, so it's not that big a hassle.
What about æ, ø and å? Does QWERTY layout work as well for you as it does for English? What about autocorrection? I mean, I understand you can use it, but isn't it a hassle to dig two levels deep each time you need to type that special character?
vangrieg said:
What about æ, ø and å? Does QWERTY layout work as well for you as it does for English? What about autocorrection? I mean, I understand you can use it, but isn't it a hassle to dig two levels deep each time you need to type that special character?
You just need to hold the "o" button for 2 secs (or something like that) to bring up characters like ø and ö. The same goes for "a" to get å, ä and æ.
We use QWERTY layout in both Sweden and Denmark.
Even if it's not localized, we can use old Scandinavian spellings, such as oe = ø, aa = å, ae = æ, or use umlauts to represent the same characters.
Best news for a long time
This is great news. Personally I do not care too much about the Danish keyboard layout; as long as the characters can be found, it is not much different from my old Blue Angel. I just hope that I will be able to buy one of those with a slide-out keyboard.
Looking forward to Mango
Do you think you guys will jump straight onto Mango then? Or will you get 7.0 devices which get updated to 7.1 pretty soon after?
Casey
My Omnia 7 already has NoDo so... Mango ASAP
Scoot and I are wanting to pare down the size of the all language Cyanogen release builds. The best way to do that is to obviously remove some languages. These are what we have discussed keeping so far. If you have a need for one that is not listed add a reply with the language you want to keep.
English
Spanish
French
German
Dutch
Italian
Portuguese
Arabic
Romanian
Russian
Polish
Czech
Hebrew
Chinese
Greek
And I know I am missing a few so please do request them!
Does anyone use Vietnamese, Chinese, Japanese or Korean? I know typically people in those countries already have phones better than our Kaisers lol.
Edit: Due to the large list of languages that we need to keep, I think we should do a Eurasia pack, an English, Spanish and French pack, and an all-languages pack, as per the suggestion earlier. That would be doable.
Part of why our builds are so huge... well, look at English for instance. There are like 10 different versions listed.
aceoyame said:
Part of why our builds are so huge... well, look at English for instance. There are like 10 different versions listed.
Stupid question: can't we have something like a language pack repository, just like Ubuntu? Then people could download an androidupdate and use only the language they want.
hi.. i want Czech please
albertorodast2007 said:
Stupid question: can't we have something like a language pack repository, just like Ubuntu? Then people could download an androidupdate and use only the language they want.
truth
I wanted to go for that idea a long time ago, but sadly it doesn't work that way. The support has to be compiled into the framework. So at the time the build is compiled you get to choose which languages, but that's it.
That's a shame, this would strip down the builds a lot... I'll stay with English then.
Dutch, of course !! brought to the world such nice words like: apartheid, boss and expressions like: to go dutch...need I say more ?? )
Goldfingerz said:
Dutch, of course !! brought to the world such nice words like: apartheid, boss and expressions like: to go dutch...need I say more ?? )
Yes, I knew I was forgetting what language was spoken in the Netherlands. I just had a brain fart and couldn't remember it lol. I am adding it to the list.
Michga said:
hi.. i want Czech please
Added to the list.
right !
aceoyame said:
Yes, I knew I was forgetting what language was spoken in the Netherlands. I just had a brain fart and couldn't remember it lol. I am adding it to the list.
Okay, now you're talking !
I would appreciate it if you could add Greek too....
English
Sent from my HERO200 using XDA App
Hi I need Hebrew, please
thoughtlesskyle said:
English
That's the first to go
Can we add ebonics to this I want to get down with my homies
I like the way dzo did for Fresh Froyo, with multiple variants: US (English and Spanish), EU (Euro-languages), and FULL (all languages). My own personal Fresh Froyo build is english only, with one wallpaper, no live wallpapers, and no apps that I can install after installation. A small /system means we can have a larger /data partition.
thoughtlesskyle said:
Can we add ebonics to this I want to get down with my homies
Added dis here as requested. brace yourself foo'!
Chinese please!
Added Greek, Hebrew and Chinese to the list. Added a suggestion for regional compiles.
Hi. I have an HTC Mozart from Orange UK and I cannot change the display language from English to Polish.
There is no Polish in Settings -> Region & Language -> Display Language.
Any idea how to install it?
Windows Phone 7 does not currently support that language, so no, there is no way of doing it. Though I've heard of a homebrew application which will let you use a keyboard which has been designed for it, and with copy/paste around the corner, when the homebrew XAPs work again afterwards, you should see them appear.
The other way is with the bigger update coming in the 2nd half of the year, which should bring official support.
The Gate Keeper said:
Windows Phone 7 does not currently support that language, so no, there is no way of doing it. Though I've heard of a homebrew application which will let you use a keyboard which has been designed for it, and with copy/paste around the corner, when the homebrew XAPs work again afterwards, you should see them appear.
The other way is with the bigger update coming in the 2nd half of the year, which should bring official support.
But Orange PL has the same phone with the Polish language...
I have this same problem
As far as I've read, Windows Phone 7 only fully supports a few languages for this first release (English, French, Italian, German and Spanish), and Polish is not included. If the phones at the store support Polish, then it's possible that their display units were customized to support it, but maybe their retail phones were not. In other words, WP7 direct from Microsoft does not support Polish, but it's possible that Orange or the phone manufacturer customized certain aspects of their devices, and you'd have to ask your carrier/manufacturer about those customizations.
I would suggest going back to the store you got it from and inquiring about it.
Also, the big update coming later this year does not include support for Polish either.
prjkthack said:
As far as I've read, Windows Phone 7 only fully supports a few languages for this first release (English, French, Italian, German and Spanish), and Polish is not included. If the phones at the store support Polish, then it's possible that their display units were customized to support it, but maybe their retail phones were not. In other words, WP7 direct from Microsoft does not support Polish, but it's possible that Orange or the phone manufacturer customized certain aspects of their devices, and you'd have to ask your carrier/manufacturer about those customizations.
I would suggest going back to the store you got it from and inquiring about it.
Also, the big update coming later this year does not include support for Polish either.
But Spanish is not supported on all WP7 phones. For example, I can't set Spanish on an HTC Surround, but on a Samsung Focus I can.
jaraya13 said:
But Spanish is not supported on all WP7 phones. For example, I can't set Spanish on an HTC Surround, but on a Samsung Focus I can.
That is the fault of the carrier/OEM. It is actually supported, but for some weird reason they (carrier/OEM) turned it off. Hopefully MS enforces the multilingual capabilities so that they can't be overridden...
The Gate Keeper said:
That is the fault of the carrier/OEM. It is actually supported, but for some weird reason they (carrier/OEM) turned it off. Hopefully MS enforces the multilingual capabilities so that they can't be overridden...
So this means that a registry tweak can do the job (enable other languages)? I am wondering why someone hasn't found it yet...
yes a registry tweak can fix it, but it would be for languages only already built into the phone. The additional languages that you're wondering about can't be enabled as they aren't in the phone. Though this is also a questionable statement as I remember seeing a homebrew application keyboard with a lot of support for other languages.
The Gate Keeper said:
yes a registry tweak can fix it, but it would be for languages only already built into the phone. The additional languages that you're wondering about can't be enabled as they aren't in the phone. Though this is also a questionable statement as I remember seeing a homebrew application keyboard with a lot of support for other languages.
Thanks for the reply but wait. What I need in the HTC Surround is Spanish, it has to be definitely built in...
Which is true based on versions in other countries. But yeah, I'm not sure how to get it enabled. I wonder if you'll get it as part of the NoDo update?
The Gate Keeper said:
That is the fault of the carrier/OEM. It is actually supported, but for some weird reason they (carrier/OEM) turned it off. Hopefully MS enforces the multilingual capabilities so that they can't be overridden...
I hope so.
It's disappointing that carriers still have the power to modify the system this much.
Does anyone know if there is Chinese language support in the T-Mobile version or the HTC unlocked version? Is there an option to turn all the menus to Chinese?
Nariyasu Heseri said:
Mod Edit: Sorry for my useless post.
You do realize that HTC is not much different from Huawei, right? HTC originated in Taiwan.
HTC Corporation (Chinese: 宏達國際電子股份有限公司; pinyin: Hóngdá Guójì Diànzǐ Gǔfèn Yǒuxiàn Gōngsī), stylised as hтc, is a Taiwanese consumer electronics company headquartered in Xindian District, New Taipei City, Taiwan. Founded in 1997, HTC began as an original design manufacturer and original equipment manufacturer, designing and manufacturing devices such as mobile phones and tablets.
Can someone else please verify that the unlocked version doesn't support Chinese language?
I just used Language Enabler to use my HTC 10 in my native language, Kurdish, and it worked for the apps that support Kurdish. You can try it; I think this app fully enables Chinese if the firmware officially supports Chinese. Or you can flash the Viper10 custom ROM; all languages are enabled on that great ROM.
Here is the Google Play link: https://play.google.com/store/apps/details?id=com.wanam&hl=en