gitweb.fperrin.net Git - DictionaryPC.git/log
DictionaryPC.git
11 years ago  Delete Anagrams and References sections.
thadh [Tue, 18 Sep 2012 18:14:34 +0000 (11:14 -0700)]
Delete Anagrams and References sections.

11 years ago  Got rid of Category:.
thadh [Tue, 18 Sep 2012 18:11:24 +0000 (11:11 -0700)]
Got rid of Category:.

11 years ago  Get rid of training "en:word" crap.
thadh [Tue, 18 Sep 2012 17:49:56 +0000 (10:49 -0700)]
Get rid of training "en:word" crap.

11 years ago  Reformat.
thadh [Tue, 18 Sep 2012 17:45:40 +0000 (10:45 -0700)]
Reformat.

11 years ago  Update unit tests for parsing function name.
thadh [Tue, 18 Sep 2012 17:45:02 +0000 (10:45 -0700)]
Update unit tests for parsing function name.

11 years ago  Fixed Builder, and escaping arg names.
thadh [Tue, 18 Sep 2012 17:13:51 +0000 (10:13 -0700)]
Fixed Builder, and escaping arg names.

11 years ago  HtmlEntries don't count as main entries.
thadh [Tue, 11 Sep 2012 00:46:51 +0000 (17:46 -0700)]
HtmlEntries don't count as main entries.

11 years ago  Whitespace.
thadh [Mon, 10 Sep 2012 22:56:40 +0000 (15:56 -0700)]
Whitespace.

11 years ago  Whitespace.
thadh [Mon, 10 Sep 2012 22:55:33 +0000 (15:55 -0700)]
Whitespace.

11 years ago  Update goldens.
thadh [Mon, 10 Sep 2012 22:17:58 +0000 (15:17 -0700)]
Update goldens.

11 years ago  Add some langs (Ancient Greek, Cantonese, Burmese(MY)), WholeSection
thadh [Mon, 10 Sep 2012 22:05:02 +0000 (15:05 -0700)]
Add some langs (Ancient Greek, Cantonese, Burmese(MY)), WholeSection
parser improvements, Splitter improvements.  Builder uses WholeSection
parser.

11 years ago  First decent implementation of HtmlEntry attached to TokenRow.
thadh [Mon, 10 Sep 2012 03:40:36 +0000 (20:40 -0700)]
First decent implementation of HtmlEntry attached to TokenRow.

11 years ago  Add TA=Tamil language.
thadh [Mon, 20 Aug 2012 03:07:41 +0000 (20:07 -0700)]
Add TA=Tamil language.

11 years ago  Test data added.
thadh [Tue, 31 Jul 2012 15:40:55 +0000 (08:40 -0700)]
Test data added.

11 years ago  Escape HTML. Test special ISO coding.
thadh [Sat, 28 Jul 2012 01:51:00 +0000 (18:51 -0700)]
Escape HTML.  Test special ISO coding.

11 years ago  gitignore
thadh [Tue, 24 Jul 2012 00:25:27 +0000 (17:25 -0700)]
gitignore

11 years ago  Baseline HTML parsing done, goldens updated!
thadh [Tue, 24 Jul 2012 00:18:29 +0000 (17:18 -0700)]
Baseline HTML parsing done, goldens updated!

11 years ago  Refactor code to generate dictionaries to make it all one loop!
thadh [Sun, 22 Jul 2012 01:43:48 +0000 (18:43 -0700)]
Refactor code to generate dictionaries to make it all one loop!

11 years ago  Update unit tests with new wiki data.
thadh [Sat, 21 Jul 2012 18:06:55 +0000 (11:06 -0700)]
Update unit tests with new wiki data.

11 years ago  Added WholeSection entries and parser.
thadh [Sat, 21 Jul 2012 17:41:21 +0000 (10:41 -0700)]
Added WholeSection entries and parser.
Switched to real xerces parser because Sun fork was crashing on
enwiktionary data.

11 years ago  Updated unit tests, added WholeSectionToHtmlParser.
thadh [Tue, 17 Jul 2012 04:09:50 +0000 (21:09 -0700)]
Updated unit tests, added WholeSectionToHtmlParser.

11 years ago  DictionaryBuilder prints sortable langs, JP->JA fix.
Thad Hughes [Sun, 20 May 2012 23:31:08 +0000 (16:31 -0700)]
DictionaryBuilder prints sortable langs, JP->JA fix.

11 years ago  Build fr_de dictionary from enwiktionary, yeah!
Thad Hughes [Fri, 11 May 2012 20:51:32 +0000 (13:51 -0700)]
Build fr_de dictionary from enwiktionary, yeah!

11 years ago  Updated to latest enwiktionary.
Thad Hughes [Thu, 10 May 2012 04:05:15 +0000 (21:05 -0700)]
Updated to latest enwiktionary.

11 years ago  Unit tests working, looks like I'd been revamping the parsers.
Thad Hughes [Wed, 9 May 2012 22:09:04 +0000 (15:09 -0700)]
Unit tests working, looks like I'd been revamping the parsers.

12 years ago  Added DictionaryBuilder.jar
Thad Hughes [Fri, 9 Mar 2012 02:15:03 +0000 (18:15 -0800)]
Added DictionaryBuilder.jar

12 years ago  Update version to v004.
Thad Hughes [Fri, 9 Mar 2012 00:58:31 +0000 (16:58 -0800)]
Update version to v004.

12 years ago  Fixes to tr= and head= make Arabic,Thai look much better.
Thad Hughes [Thu, 8 Mar 2012 19:33:29 +0000 (11:33 -0800)]
Fixes to tr= and head= make Arabic,Thai look much better.

12 years ago  Update unit tests for new wiktionary.
Thad Hughes [Thu, 8 Mar 2012 17:56:35 +0000 (09:56 -0800)]
Update unit tests for new wiktionary.

12 years ago  Bug-fixes to WikiTokenizer (handle weird line-feed), update to newest
Thad Hughes [Thu, 8 Mar 2012 17:15:50 +0000 (09:15 -0800)]
Bug-fixes to WikiTokenizer (handle weird line-feed), update to newest
enwiktionary.

12 years ago  EnTranslationToTranslationParser
Thad Hughes [Tue, 6 Mar 2012 00:51:01 +0000 (16:51 -0800)]
EnTranslationToTranslationParser

12 years ago  Fixed combining marks on Unicode regexes.
Thad Hughes [Tue, 6 Mar 2012 00:50:29 +0000 (16:50 -0800)]
Fixed combining marks on Unicode regexes.

12 years ago  Unit tests working again after refactoring!!!
Thad Hughes [Fri, 10 Feb 2012 19:43:15 +0000 (11:43 -0800)]
Unit tests working again after refactoring!!!

12 years ago  Major en refactoring underway.
Thad Hughes [Fri, 10 Feb 2012 18:49:08 +0000 (10:49 -0800)]
Major en refactoring underway.

12 years ago  Rename enwiktionary package to wiktionary.
Thad Hughes [Fri, 10 Feb 2012 16:51:06 +0000 (08:51 -0800)]
Rename enwiktionary package to wiktionary.

12 years ago  gitignore
Thad Hughes [Wed, 8 Feb 2012 23:49:42 +0000 (15:49 -0800)]
gitignore

12 years ago  Point unit tests at new wikiSplit/en/.
Thad Hughes [Wed, 8 Feb 2012 23:48:54 +0000 (15:48 -0800)]
Point unit tests at new wikiSplit/en/.

12 years ago  Split EN, DE, IT, FR wiktionaries! Fix splitting to use entire header
Thad Hughes [Wed, 8 Feb 2012 23:45:40 +0000 (15:45 -0800)]
Split EN, DE, IT, FR wiktionaries!  Fix splitting to use entire header
line (hopefully this works ok).

12 years ago  Todo
Thad Hughes [Sat, 4 Feb 2012 01:07:51 +0000 (17:07 -0800)]
Todo

12 years ago  Fix test.
Thad Hughes [Tue, 31 Jan 2012 23:06:26 +0000 (15:06 -0800)]
Fix test.

12 years ago  Moved normalization, more tests.
Thad Hughes [Tue, 31 Jan 2012 22:56:05 +0000 (14:56 -0800)]
Moved normalization, more tests.

12 years ago  Stoplist, more languages...
Thad Hughes [Mon, 30 Jan 2012 06:08:52 +0000 (22:08 -0800)]
Stoplist, more languages...

12 years ago  zipSize, overrideStoplist-> special isMainEntry, tagalog, trying to
Thad Hughes [Thu, 26 Jan 2012 00:03:03 +0000 (16:03 -0800)]
zipSize, overrideStoplist-> special isMainEntry, tagalog, trying to
count {{t}} but failing

12 years ago  Added Urdu!
Thad Hughes [Tue, 24 Jan 2012 05:33:01 +0000 (21:33 -0800)]
Added Urdu!

12 years ago  Newlines in info message.
Thad Hughes [Fri, 20 Jan 2012 01:45:32 +0000 (17:45 -0800)]
Newlines in info message.

12 years ago  Better DictionaryInfo, IndexBuilder counts main TokenRows.
Thad Hughes [Tue, 17 Jan 2012 21:13:38 +0000 (13:13 -0800)]
Better DictionaryInfo, IndexBuilder counts main TokenRows.

12 years ago  Wiktionary upgrade!
Thad Hughes [Mon, 16 Jan 2012 19:43:27 +0000 (11:43 -0800)]
Wiktionary upgrade!

12 years ago  DictionaryInfo has full file URL.
Thad Hughes [Mon, 16 Jan 2012 17:55:25 +0000 (09:55 -0800)]
DictionaryInfo has full file URL.

12 years ago  2 types of TokenRow.
Thad Hughes [Mon, 16 Jan 2012 07:14:01 +0000 (23:14 -0800)]
2 types of TokenRow.

Merge branch 'master' of
https://code.google.com/p/quickdic-dictionary.dictionarypc

Conflicts:
src/com/hughes/android/dictionary/engine/DictionaryBuilderMain.java
todo.txt

12 years ago  Changing the way dictionaries are indexed (listed), new type of TokenRow
Thad Hughes [Mon, 16 Jan 2012 00:08:07 +0000 (16:08 -0800)]
Changing the way dictionaries are indexed (listed), new type of TokenRow
(to distinguish major from minor entries).

12 years ago  more downloads
Thad Hughes [Wed, 11 Jan 2012 22:14:29 +0000 (14:14 -0800)]
more downloads

12 years ago  todo comments
Thad Hughes [Fri, 6 Jan 2012 20:53:06 +0000 (12:53 -0800)]
todo comments

12 years ago  Add your own dictionary
Thad Hughes [Thu, 5 Jan 2012 23:51:17 +0000 (15:51 -0800)]
Add your own dictionary

12 years ago  Change logging.
Thad Hughes [Thu, 5 Jan 2012 00:38:01 +0000 (16:38 -0800)]
Change logging.

12 years ago  Changed ordering of FormOf, handling of FormOf, handing of Encoding.
Thad Hughes [Wed, 4 Jan 2012 19:55:57 +0000 (11:55 -0800)]
Changed ordering of FormOf, handling of FormOf, handing of Encoding.

12 years ago  Example splitting fixes, tokenizer newline handling, Chinese
Thad Hughes [Wed, 4 Jan 2012 16:32:09 +0000 (08:32 -0800)]
Example splitting fixes, tokenizer newline handling, Chinese
transliteration unit test.

12 years ago  Examples now parsed with dispatch. Better {{l}} and {{term}} handling.
Thad Hughes [Tue, 3 Jan 2012 23:06:04 +0000 (15:06 -0800)]
Examples now parsed with dispatch.  Better {{l}} and {{term}} handling.

12 years ago  Parse foreign text with new wiki parser.
Thad Hughes [Tue, 3 Jan 2012 05:53:35 +0000 (21:53 -0800)]
Parse foreign text with new wiki parser.

12 years ago  Fixing goldens after refactoring.
Thad Hughes [Tue, 3 Jan 2012 04:52:45 +0000 (20:52 -0800)]
Fixing goldens after refactoring.

12 years ago  Major refactor in the way wikiText is parsed.
Thad Hughes [Mon, 2 Jan 2012 18:00:35 +0000 (10:00 -0800)]
Major refactor in the way wikiText is parsed.

12 years ago  Refactoring wiki parsing, bigtime. Underway, so lots of errors....
Thad Hughes [Fri, 30 Dec 2011 19:36:40 +0000 (11:36 -0800)]
Refactoring wiki parsing, bigtime.  Underway, so lots of errors....

12 years ago  More languages, simpler splitter.
Thad Hughes [Fri, 30 Dec 2011 02:25:29 +0000 (18:25 -0800)]
More languages, simpler splitter.

12 years ago  Fixed ZH sort ordering (remove spaces).
Thad Hughes [Wed, 28 Dec 2011 23:52:53 +0000 (15:52 -0800)]
Fixed ZH sort ordering (remove spaces).

12 years ago  Better downloading, fix Builder.
Thad Hughes [Wed, 28 Dec 2011 23:52:44 +0000 (15:52 -0800)]
Better downloading, fix Builder.

12 years ago  {{defn}} is empty, update to new wiktionary, required processing
Thad Hughes [Wed, 28 Dec 2011 06:07:38 +0000 (22:07 -0800)]
{{defn}} is empty, update to new wiktionary, required processing
{{head}}

12 years ago  Download changes.
Thad Hughes [Wed, 28 Dec 2011 05:04:12 +0000 (21:04 -0800)]
Download changes.

12 years ago  more ignore
Thad Hughes [Wed, 28 Dec 2011 01:19:13 +0000 (17:19 -0800)]
more ignore

12 years ago  Upgrading wiktionary version....
Thad Hughes [Wed, 28 Dec 2011 01:17:55 +0000 (17:17 -0800)]
Upgrading wiktionary version....

12 years ago  Starting callback-based parsing....
Thad Hughes [Tue, 27 Dec 2011 19:14:23 +0000 (11:14 -0800)]
Starting callback-based parsing....

12 years ago  Starting callback-based parsing....
Thad Hughes [Tue, 27 Dec 2011 19:14:02 +0000 (11:14 -0800)]
Starting callback-based parsing....

12 years ago  Add script to download data.
Thad Hughes [Mon, 26 Dec 2011 02:20:41 +0000 (18:20 -0800)]
Add script to download data.

12 years ago  Test data updates.
Thad Hughes [Mon, 26 Dec 2011 02:14:50 +0000 (18:14 -0800)]
Test data updates.

12 years ago  todo
Thad Hughes [Sun, 25 Dec 2011 23:56:54 +0000 (15:56 -0800)]
todo

12 years ago  Moved data here.
Thad Hughes [Sun, 25 Dec 2011 08:53:07 +0000 (00:53 -0800)]
Moved data here.

12 years ago  New unit tests, { instead of {{,
Thad Hughes [Sat, 24 Dec 2011 07:36:08 +0000 (23:36 -0800)]
New unit tests, { instead of {{,

12 years ago  Better {{form of}} handling, remove "lang=..."
Thad Hughes [Fri, 23 Dec 2011 02:57:48 +0000 (18:57 -0800)]
Better {{form of}} handling, remove "lang=..."

12 years ago  Printint dictionaries for diff.
Thad Hughes [Wed, 21 Dec 2011 01:30:26 +0000 (17:30 -0800)]
Printint dictionaries for diff.

12 years ago  Uploader.
Thad Hughes [Wed, 21 Dec 2011 00:54:02 +0000 (16:54 -0800)]
Uploader.

12 years ago  Fixed {{infl}}
Thad Hughes [Wed, 21 Dec 2011 00:53:50 +0000 (16:53 -0800)]
Fixed {{infl}}

12 years ago  Show all rows in foreign lists.
Thad Hughes [Tue, 20 Dec 2011 21:18:17 +0000 (13:18 -0800)]
Show all rows in foreign lists.

12 years ago  Handling {{infl}}
Thad Hughes [Tue, 20 Dec 2011 20:50:40 +0000 (12:50 -0800)]
Handling {{infl}}

12 years ago  initialism and acronym.
Thad Hughes [Tue, 20 Dec 2011 17:59:03 +0000 (09:59 -0800)]
initialism and acronym.

12 years ago  Adding goldens back?
Thad Hughes [Tue, 20 Dec 2011 17:52:34 +0000 (09:52 -0800)]
Adding goldens back?

12 years ago  Initialism, changes in regex matching.
Thad Hughes [Tue, 20 Dec 2011 17:39:57 +0000 (09:39 -0800)]
Initialism, changes in regex matching.

12 years ago  Added goldens for de_wiktionary and zh_wiktionary.
Thad Hughes [Mon, 19 Dec 2011 21:10:43 +0000 (13:10 -0800)]
Added goldens for de_wiktionary and zh_wiktionary.

12 years ago  Fixed handling of non top level languages inside Translations section.
Thad Hughes [Mon, 19 Dec 2011 21:10:13 +0000 (13:10 -0800)]
Fixed handling of non top level languages inside Translations section.

12 years ago  Fix enIndex=1, not 2.
Thad Hughes [Mon, 19 Dec 2011 19:08:34 +0000 (11:08 -0800)]
Fix enIndex=1, not 2.

12 years ago  Added outputs to .gitignore
Thad Hughes [Sun, 18 Dec 2011 19:39:30 +0000 (11:39 -0800)]
Added outputs to .gitignore

12 years ago  Added testdata.
Thad Hughes [Sun, 18 Dec 2011 19:39:01 +0000 (11:39 -0800)]
Added testdata.

12 years ago  Move test data, fix DictFileParser, fix splitter, fix crash during
Thad Hughes [Sun, 18 Dec 2011 19:38:00 +0000 (11:38 -0800)]
Move test data, fix DictFileParser, fix splitter, fix crash during
weird qualifier.

12 years ago  Redo splitter language codes.
Thad Hughes [Sat, 17 Dec 2011 06:14:56 +0000 (22:14 -0800)]
Redo splitter language codes.

12 years ago  Fixing examples...
Thad Hughes [Sat, 17 Dec 2011 04:12:13 +0000 (20:12 -0800)]
Fixing examples...

12 years ago  First pass at examples.
Thad Hughes [Sat, 17 Dec 2011 03:53:47 +0000 (19:53 -0800)]
First pass at examples.

12 years ago  Tokenizer fixes.
Thad Hughes [Sat, 17 Dec 2011 03:13:01 +0000 (19:13 -0800)]
Tokenizer fixes.

12 years ago  Test path bug.
Thad Hughes [Sat, 17 Dec 2011 02:15:11 +0000 (18:15 -0800)]
Test path bug.

12 years ago  Stoplists, fix location of wikisplits.
Thad Hughes [Fri, 16 Dec 2011 19:47:23 +0000 (11:47 -0800)]
Stoplists, fix location of wikisplits.

12 years ago  Fixing.
Thad Hughes [Wed, 14 Dec 2011 20:04:30 +0000 (12:04 -0800)]
Fixing.

12 years ago  Reworking handling of foreign section.
Thad Hughes [Wed, 14 Dec 2011 19:09:34 +0000 (11:09 -0800)]
Reworking handling of foreign section.

12 years ago  Switched to logger.
Thad Hughes [Wed, 14 Dec 2011 15:55:55 +0000 (07:55 -0800)]
Switched to logger.

12 years ago  Fixing tests after moving out data.
Thad Hughes [Wed, 14 Dec 2011 00:02:12 +0000 (16:02 -0800)]
Fixing tests after moving out data.