gitweb.fperrin.net Git - DictionaryPC.git/log
Reimar Döffinger [Wed, 5 Oct 2016 22:27:21 +0000 (00:27 +0200)]
Fix it-noun parser.

Removes mnull entries in EN-IT dictionary.

Reimar Döffinger [Wed, 5 Oct 2016 20:47:17 +0000 (22:47 +0200)]
Fix typo, not sure what it fixes though.

Reimar Döffinger [Wed, 5 Oct 2016 19:59:42 +0000 (21:59 +0200)]
Add commented out possible future improvements

Reimar Döffinger [Wed, 5 Oct 2016 19:56:18 +0000 (21:56 +0200)]
Partial progress to fix frwiktionary parsing.

Reimar Döffinger [Sun, 5 Jun 2016 12:26:53 +0000 (14:26 +0200)]
Add DE-cmn dictionary to generation list.

Reimar Döffinger [Sun, 5 Jun 2016 12:25:27 +0000 (14:25 +0200)]
Fix special cases for sub-languages like Mandarin.

Reimar Döffinger [Fri, 15 Apr 2016 19:19:56 +0000 (21:19 +0200)]
Support generation FR-AR dictionary.

Reimar Döffinger [Sat, 19 Mar 2016 21:20:07 +0000 (22:20 +0100)]
Add missing | in pattern.

Reimar Döffinger [Sun, 6 Mar 2016 14:02:56 +0000 (15:02 +0100)]
Some bugfixes for generation script.

Reset variables and fix second stop list.

Reimar Döffinger [Sun, 6 Mar 2016 13:34:01 +0000 (14:34 +0100)]
Add two more languages to generation list.

Reimar Döffinger [Mon, 11 Jan 2016 19:42:39 +0000 (20:42 +0100)]
Add support to generate pure translation-to-translation dictionaries.

Reimar Döffinger [Mon, 11 Jan 2016 19:26:23 +0000 (20:26 +0100)]
Ignore any local .class files.

Reimar Döffinger [Wed, 6 Jan 2016 16:15:30 +0000 (17:15 +0100)]
More generic run scripts.

Reimar Döffinger [Wed, 6 Jan 2016 16:10:12 +0000 (17:10 +0100)]
More robust compilation script.

Reimar Döffinger [Wed, 6 Jan 2016 16:09:59 +0000 (17:09 +0100)]
Fix compilation.

Reimar Döffinger [Thu, 17 Dec 2015 22:07:02 +0000 (23:07 +0100)]
Fix filtering out translation from French HTML.

Reimar Döffinger [Wed, 16 Dec 2015 22:37:40 +0000 (23:37 +0100)]
Re-add some English dictionaries after fixing them.

Reimar Döffinger [Wed, 16 Dec 2015 22:35:44 +0000 (23:35 +0100)]
Fix for splitting Mandarin/Cantonese/...

Reimar Döffinger [Wed, 16 Dec 2015 21:51:03 +0000 (22:51 +0100)]
Fix splitting of Greek/Ancient Greek.

Reimar Döffinger [Mon, 14 Dec 2015 22:49:56 +0000 (23:49 +0100)]
Improve tokenizer speed.

Reimar Döffinger [Sun, 13 Dec 2015 14:10:31 +0000 (15:10 +0100)]
Use default Java Collator.

Reimar Döffinger [Sun, 13 Dec 2015 13:45:13 +0000 (14:45 +0100)]
Switch to newer icu4j to fix hang bugs with EN-ZH.

Reimar Döffinger [Sun, 13 Dec 2015 12:22:42 +0000 (13:22 +0100)]
Fix parsing of examples with multiline foreign part.

Reimar Döffinger [Sun, 13 Dec 2015 01:42:54 +0000 (02:42 +0100)]
Minor code cleanup.

Reimar Döffinger [Sun, 13 Dec 2015 00:07:40 +0000 (01:07 +0100)]
Avoid replaceAll.

It uses regexp and is horribly slow, so use replace
where it works just as well.

Reimar Döffinger [Sun, 13 Dec 2015 00:06:44 +0000 (01:06 +0100)]
Free some memory as early as possible.

Reimar Döffinger [Sat, 12 Dec 2015 20:02:05 +0000 (21:02 +0100)]
Fix German name for latin.

Should fix almost all words missing in the DE-LA dictionary.

Reimar Döffinger [Sat, 12 Dec 2015 15:11:36 +0000 (16:11 +0100)]
Fix compilation against latest newformat branch.

Reimar Döffinger [Wed, 9 Dec 2015 17:20:49 +0000 (18:20 +0100)]
Encode URLs as ASCII, avoid UTF-8.

This is necessary for links to work on
Android 2.x.

Reimar Döffinger [Tue, 8 Dec 2015 18:56:51 +0000 (19:56 +0100)]
Improvements to wikisplit code.

Reimar Döffinger [Tue, 8 Dec 2015 05:17:48 +0000 (06:17 +0100)]
Switch script to generate version 7 zips.

Reimar Döffinger [Mon, 7 Dec 2015 15:36:22 +0000 (16:36 +0100)]
Support v006 and v007 dictionary formats.

Reimar Döffinger [Tue, 17 Nov 2015 13:00:16 +0000 (14:00 +0100)]
Support generating DE-JA and DE-RU dictionaries.

Reimar Döffinger [Sun, 11 Oct 2015 17:50:11 +0000 (19:50 +0200)]
Support generating Ancient Greek dictionary.

Reimar Döffinger [Thu, 24 Sep 2015 19:34:21 +0000 (21:34 +0200)]
Add EN-cmn generation.

Reimar Döffinger [Mon, 14 Sep 2015 19:34:42 +0000 (21:34 +0200)]
Add FR-* and IT-* generation support to script.

Reimar Döffinger [Mon, 14 Sep 2015 17:45:43 +0000 (19:45 +0200)]
Add forgotten dictlist files.

Reimar Döffinger [Sat, 12 Sep 2015 11:30:51 +0000 (13:30 +0200)]
Update for new dictionary version URL.

Reimar Döffinger [Sat, 12 Sep 2015 11:30:30 +0000 (13:30 +0200)]
Add generation of DE-* dictionaries.

Reimar Döffinger [Sat, 5 Sep 2015 11:02:24 +0000 (13:02 +0200)]
Expand stoplist.

Reimar Döffinger [Sat, 5 Sep 2015 10:42:12 +0000 (12:42 +0200)]
Some fixes for dictionary generation script.

Reimar Döffinger [Sat, 5 Sep 2015 10:41:19 +0000 (12:41 +0200)]
Fix WiktionarySplitter breakage.

Reimar Döffinger [Sat, 5 Sep 2015 10:22:21 +0000 (12:22 +0200)]
Remove dummy code that makes no sense/does not work.

Reimar Döffinger [Fri, 28 Aug 2015 11:51:39 +0000 (13:51 +0200)]
Try filtering out anagrams from FR dictionary.

Reimar Döffinger [Fri, 28 Aug 2015 11:03:14 +0000 (13:03 +0200)]
Generate more single-language dictionaries.

Reimar Döffinger [Fri, 28 Aug 2015 11:02:47 +0000 (13:02 +0200)]
Partial support for Spanish Wiktionary.

Reimar Döffinger [Fri, 28 Aug 2015 10:50:18 +0000 (12:50 +0200)]
Also accept language variants as Spanish.

Reimar Döffinger [Fri, 28 Aug 2015 10:37:25 +0000 (12:37 +0200)]
Hacks to support Spanish wiktionary.

Reimar Döffinger [Fri, 28 Aug 2015 10:22:43 +0000 (12:22 +0200)]
Need at least 4GB heap to create dictionaries.

Reimar Döffinger [Fri, 28 Aug 2015 10:21:57 +0000 (12:21 +0200)]
Report error when hitting end when searching token.

Reimar Döffinger [Fri, 28 Aug 2015 08:33:10 +0000 (10:33 +0200)]
Add forgotten compression command.

Reimar Döffinger [Fri, 28 Aug 2015 04:44:02 +0000 (06:44 +0200)]
Small updates to dictionary generation.

Reimar Döffinger [Thu, 27 Aug 2015 20:38:38 +0000 (22:38 +0200)]
Add commons-lang3.jar to classpath.

Reimar Döffinger [Thu, 27 Aug 2015 17:08:16 +0000 (19:08 +0200)]
Update for new dictionary release URL.

Reimar Döffinger [Thu, 27 Aug 2015 17:04:23 +0000 (19:04 +0200)]
Add comment about hang issue.

Reimar Döffinger [Thu, 27 Aug 2015 16:05:57 +0000 (18:05 +0200)]
Add script to help with dictionary generation.

Also update en and sv stoplists and parse the
Spanish wiktionary a bit.

Reimar Döffinger [Wed, 26 Aug 2015 20:34:44 +0000 (22:34 +0200)]
Script adds/improvements dictionary generation.

Reimar Döffinger [Wed, 26 Aug 2015 20:21:13 +0000 (22:21 +0200)]
Download latest wiktionary files.

The old ones referenced no longer exist,
so just try with the latest ones.

Reimar Döffinger [Wed, 26 Aug 2015 20:20:48 +0000 (22:20 +0200)]
Add script to update dictionary list in app.

Reimar Döffinger [Mon, 24 Aug 2015 20:14:10 +0000 (22:14 +0200)]
Update file location URL.

Reimar Döffinger [Mon, 24 Aug 2015 19:31:01 +0000 (21:31 +0200)]
Add horrible but working compile/run scripts.

Reimar Döffinger [Mon, 24 Aug 2015 19:29:49 +0000 (21:29 +0200)]
Replace com.sun.xml.internal.rngom.util.Uri.

I have no idea where that package can be found.

Reimar Döffinger [Mon, 24 Aug 2015 19:28:37 +0000 (21:28 +0200)]
Disable some debug code to allow compilation.

Thad Hughes [Thu, 26 Dec 2013 01:48:07 +0000 (17:48 -0800)]
Fixes for Malay$ and reorderings due to new ICU4J.

Thad Hughes [Tue, 3 Dec 2013 18:34:21 +0000 (10:34 -0800)]
Update WiktionaryLangs.

Thad Hughes [Fri, 29 Nov 2013 20:44:03 +0000 (12:44 -0800)]
Dictionary added by Phil.

Thad Hughes [Sun, 7 Apr 2013 18:01:16 +0000 (11:01 -0700)]
Updated DictionaryBuilder.jar.

Thad Hughes [Sun, 7 Apr 2013 17:57:04 +0000 (10:57 -0700)]
IT-TR dictionary test.

Thad Hughes [Sun, 7 Apr 2013 17:53:36 +0000 (10:53 -0700)]
go

Thad Hughes [Wed, 9 Jan 2013 05:53:42 +0000 (21:53 -0800)]
Fix Malay/Malayalam, add test for "buon g".

Thad Hughes [Sat, 5 Jan 2013 18:13:16 +0000 (10:13 -0800)]
Using new Chemnitz dictionary.

Thad Hughes [Sat, 5 Jan 2013 18:00:39 +0000 (10:00 -0800)]
Fix name of chemnitz dictionary.

Thad Hughes [Sat, 5 Jan 2013 17:57:44 +0000 (09:57 -0800)]
Fix AF-EN test.

Thad Hughes [Sat, 5 Jan 2013 06:17:34 +0000 (22:17 -0800)]
Fixed comment for German dictionary.

Thad Hughes [Thu, 3 Jan 2013 18:44:44 +0000 (10:44 -0800)]
Eliminated <ref>s.

Thad Hughes [Thu, 3 Jan 2013 05:09:21 +0000 (21:09 -0800)]
Skip Italian references.

Thad Hughes [Thu, 3 Jan 2013 03:35:08 +0000 (19:35 -0800)]
Split ZH into yue and cmn, fixed German heading.

Thad Hughes [Sun, 30 Dec 2012 06:36:12 +0000 (22:36 -0800)]
FR single lang.

Thad Hughes [Sun, 30 Dec 2012 06:35:44 +0000 (22:35 -0800)]
Update URL format and parsing, fix FR handling.

Thad Hughes [Sun, 23 Dec 2012 18:38:28 +0000 (10:38 -0800)]
Multi word search now looks for exact matches of TokenRows.

Thad Hughes [Sun, 23 Dec 2012 17:43:29 +0000 (09:43 -0800)]
Building dictionaries.

Thad Hughes [Sun, 16 Dec 2012 00:02:53 +0000 (16:02 -0800)]
Update to latest wiktionaries, update unit tests, der-top/mid/bottom.

Thad Hughes [Sat, 15 Dec 2012 23:34:20 +0000 (15:34 -0800)]
Fixed URL encoding in goldens.

Thad Hughes [Mon, 3 Dec 2012 21:47:50 +0000 (13:47 -0800)]
go

thadh [Sun, 7 Oct 2012 18:36:16 +0000 (11:36 -0700)]
Added simple parsing logic for DE and IT wiktionaries.

thadh [Thu, 4 Oct 2012 17:19:32 +0000 (10:19 -0700)]
Updated test cases to latest wiktionary dumps.

thadh [Thu, 4 Oct 2012 15:09:10 +0000 (08:09 -0700)]
Updated input locations.  Moved pairs in builder.

thadh [Wed, 3 Oct 2012 23:12:33 +0000 (16:12 -0700)]
Fixed trailing ,s in italian verb tenses.
Hotlink to URL at bottom of HTMLEntry page.
Parsing an only-English dictionary for the first time, yay!

thadh [Mon, 1 Oct 2012 17:41:33 +0000 (10:41 -0700)]
Format links properly.

thadh [Sun, 30 Sep 2012 17:04:56 +0000 (10:04 -0700)]
Synonyms, antonyms.

thadh [Tue, 25 Sep 2012 15:42:10 +0000 (08:42 -0700)]
Don't handle it-conj in EnParser.

thadh [Tue, 25 Sep 2012 05:47:22 +0000 (22:47 -0700)]
it-noun.

thadh [Tue, 25 Sep 2012 05:29:28 +0000 (22:29 -0700)]
Link forms, page limit arabic, change HTML.

thadh [Tue, 25 Sep 2012 04:43:16 +0000 (21:43 -0700)]
Put links into HtmlEntry.

thadh [Sun, 23 Sep 2012 15:54:13 +0000 (08:54 -0700)]
Italian verb conjugations!

thadh [Sat, 22 Sep 2012 19:39:15 +0000 (12:39 -0700)]
it-conj (most of the way), unicode handling in strings.

thadh [Tue, 18 Sep 2012 19:55:59 +0000 (12:55 -0700)]
Expand Italian test to get verb conjugations.

thadh [Tue, 18 Sep 2012 19:46:23 +0000 (12:46 -0700)]
Basic general functions in WholeSectionParser.

thadh [Tue, 18 Sep 2012 18:51:02 +0000 (11:51 -0700)]
Skip lang=XX for the lang we care about.

thadh [Tue, 18 Sep 2012 18:36:15 +0000 (11:36 -0700)]
Skip w: and Image: wikiLinks.