Revision history - None - origin: https://github.com/google/sling

Revision	Author	Date	Message	Commit Date
2956799	Anders Thorhauge Sandholm	20 March 2019, 21:51:44 UTC	Fix subsumed calculation (#345) * Fix subsumed calculation * Add date special case handling	20 March 2019, 21:51:44 UTC
080c05a	rahul1980	19 March 2019, 14:30:02 UTC	Minor fixes to the Wikicat browser (#344) - Some cosmetic fixes - Fix a counting bug while generating recordio	19 March 2019, 14:30:02 UTC
a6a6775	rahul1980	18 March 2019, 17:29:57 UTC	Enhancements to the Wikicat browser (#343) - Hovering over fact-matching counts now brings up a list of qids that illustrate that count. - While browsing a signature, the user now has an option to generate Wikibot recordio files directly from the browser. - Modify the fact member so that it stores exemplars for each match type. Further, this list is exhaustive for NEW, ADDITIONAL, and SUBSUMED_BY_EXISTING. This is used in both the new features above. - Simplified the browser code a bit, and removed some unnecessary Javascript.	18 March 2019, 17:29:57 UTC
64c35cf	Michael Ringgaard	15 March 2019, 13:31:01 UTC	Fix the problem with women (#341)	15 March 2019, 13:31:01 UTC
fce18b2	Michael Ringgaard	15 March 2019, 11:49:51 UTC	Infobox aliases and wiki links (#339)	15 March 2019, 11:49:51 UTC
b765579	rahul1980	13 March 2019, 17:45:32 UTC	Various Wikicat fixes (#340) - Omit outputting low-frequency (pid, qid) spans. - Move subsumption checking code to Python - We only get the closure from C++ code now - Subsumption code in Python checks for genre, subclass, part_of, parent_org, located_in, and date subsumption. More properties can be added easily. - Fix a minor bug in the browser. - Expose the Resolve() method in the Python API.	13 March 2019, 17:45:32 UTC
97746ce	Michael Ringgaard	13 March 2019, 13:13:27 UTC	Token styles (#338)	13 March 2019, 13:13:27 UTC
05373f7	Michael Ringgaard	11 March 2019, 09:41:31 UTC	XML frame reader (#337)	11 March 2019, 09:41:31 UTC
cdacfc0	rahul1980	26 February 2019, 22:02:44 UTC	Browser for category parses (#336) Allows browsing by category, signature, or top signatures (denoted by 'top'). Supports coarse and fine signatures, and three metrics to sort the parses with. Allows customization of fact-matching scores.	26 February 2019, 22:02:44 UTC
7c53dec	Anders Thorhauge Sandholm	25 February 2019, 10:13:42 UTC	A few fixes to wikibot.py (#332)	25 February 2019, 10:13:42 UTC
b0388ed	Michael Ringgaard	25 February 2019, 10:00:37 UTC	Check HTTP path after unescaping (#335)	25 February 2019, 10:00:37 UTC
1c09339	rahul1980	21 February 2019, 04:13:35 UTC	Fact-matching statistics for category parsing (#331) * FactMatcher that computes how proposed facts in a parse match with existing facts for the same property. They are classified as new, or matching existing facts (either exactly or via subsumption), or conflicting (e.g. for unique-valued properties) or additional facts. * A workflow task that attaches this information to each parse. Other changes: * Add methods to FactExtractor for: (a) only reporting facts for specified properties. (b) report facts with or without backoff (aka closure). Previously the only supported mode was with backoff. I have confirmed that this change doesn't affect the runtime of the backoff mode. (c) Add a method to check if one value subsumes another. These methods also come with the corresponding Python API methods. * Skip empty parses in the parse generator. * A performance improvement: we replace a sling.Array with a python list, allowing the sling.Store behind the array to be garbage collected. * Store the list of category members in the category frame. These only cover legit members, e.g. they exclude subcategories. * Add 'type of sport' and 'cause of death' to the custom taxonomy. * Replace prior and member_score values with their geometric means. This makes it fair to compare a parse with many spans vs a parse with only a few spans (since the priors and member_scores are multiplicative across spans). * Also attach the coarse signature to each parse. * Allow load_kb() to also take filename arguments, and make it use a global pool of loaded KBs. This way multiple tasks can share a KB, which saves both memory and runtime. * Add a 'skip_generation' flag to the workflow, so we have an option to not run (and instead use the cached output of) the expensive candidate parse generation stage.	21 February 2019, 04:13:35 UTC
87b8666	Michael Ringgaard	06 February 2019, 18:34:19 UTC	Handle Unicode strings in Python frame API (#330)	06 February 2019, 18:34:19 UTC
797ec43	Darren Garvey	01 February 2019, 13:27:17 UTC	Fix benign error in KB UI. (#329) When deleting characters from the search bar the update handler gets a null item. Check it before dereferencing `.ref` on it. For reference, the console error: TypeError: Cannot read property 'ref' of undefined at Object.self.selectedItemChange (kb.js:46) at fn (eval at compile (angular.js:14605), <anonymous>:4:318) at m.d.(:8080/kb/anonymous function) [as itemChange] (http://ajax.googleapis.com/ajax/libs/angularjs/1.5.7/angular.min.js:83:232) at D (angular-material.min.js:13) at N (angular-material.min.js:13) at m.$digest (angular.js:17286) at b.$apply (angular.js:17552) at Pg.$$debounceViewValueCommit (angular.js:27516) at Pg.$setViewValue (angular.js:27488) at HTMLInputElement.l (angular.js:23730)	01 February 2019, 13:27:17 UTC
8f0d22d	Anders Thorhauge Sandholm	28 January 2019, 10:06:26 UTC	Better extraction and upload of birth and death dates. (#328)	28 January 2019, 10:06:26 UTC
7ea2aa6	Jordan Rupprecht	22 January 2019, 21:39:05 UTC	[cpu] Replace _xgetbv identifier with xgetbv (#327)	22 January 2019, 21:39:05 UTC
93c2ab1	Michael Ringgaard	22 January 2019, 09:35:43 UTC	Alias transfer (#326)	22 January 2019, 09:35:43 UTC
e037430	Michael Ringgaard	14 January 2019, 22:53:18 UTC	Update README.md	14 January 2019, 22:53:18 UTC
051b8ad	rahul1980	14 January 2019, 21:50:13 UTC	Fix NotShiftOrMarkDelegate to work even if there is no MARK action (#324) When the training corpora only consists of single-token spans, MARK is not added to the action table. Therefore actions.mark() will return None and break NotShiftOrMarkDelegate. This PR fixes the delegate's behavior in this corner case.	14 January 2019, 21:50:13 UTC
18f0b22	rahul1980	04 January 2019, 22:10:50 UTC	Wikicat parse generator, filter, and signature builder. (#320) * Initial version of the parse generator + ranker + signature producer. - We get about 3.5M filtered parses across 520K acceptable English category titles. * Bug fix in Task::GetInputs()	04 January 2019, 22:10:50 UTC
60643d6	Michael Ringgaard	02 January 2019, 17:08:55 UTC	Template expansion for wikitext (#322)	02 January 2019, 17:08:55 UTC
c335393	Michael Ringgaard	02 January 2019, 15:00:55 UTC	Data files for template extraction (#321)	02 January 2019, 15:00:55 UTC
0e20b4d	Anders Thorhauge Sandholm	21 December 2018, 14:42:59 UTC	Refining birth and death date extraction and updating (#319)	21 December 2018, 14:42:59 UTC
4def25e	Anders Thorhauge Sandholm	19 December 2018, 13:13:35 UTC	Fix ; and : tokens to not be condeos (#318)	19 December 2018, 13:13:35 UTC
4ec8a7f	Anders Thorhauge Sandholm	19 December 2018, 12:35:16 UTC	Extracting birth death dates from English Wikipedia articles (#317)	19 December 2018, 12:35:16 UTC
63130a1	Anders Thorhauge Sandholm	17 December 2018, 15:48:48 UTC	Adding 'figure dash' and 'minus sign' tokens and a few abbreviations (#316) * Adding 'figure dash' and 'minus sign' tokens and a few abbreviations	17 December 2018, 15:48:48 UTC
c63640d	Anders Thorhauge Sandholm	17 December 2018, 10:11:16 UTC	Wikiflow uses latest dump files. Improved wikibot handling of date precision. (#315) * Wikiflow downloads latest dump files by default. * Improved handling of date precision in wikibot.	17 December 2018, 10:11:16 UTC
3a1ea68	Michael Ringgaard	14 December 2018, 13:32:42 UTC	Refactor wiki text parser (#314)	14 December 2018, 13:32:42 UTC
f9163e9	Michael Ringgaard	13 December 2018, 10:38:52 UTC	Treat underscore as letter in tokenization (#313)	13 December 2018, 10:38:52 UTC
66fb7fb	Michael Ringgaard	12 December 2018, 09:41:26 UTC	Matrix-matrix multiplication for integers (#312)	12 December 2018, 09:41:26 UTC
6f0acf0	Michael Ringgaard	10 December 2018, 16:44:41 UTC	Split kernel (#311)	10 December 2018, 16:44:41 UTC
7483c02	Michael Ringgaard	10 December 2018, 11:57:05 UTC	Handle Wikidata sense ids in converter (#310)	10 December 2018, 11:57:05 UTC
7b0690e	Michael Ringgaard	07 December 2018, 14:15:25 UTC	Prepare for google3 import (#308)	07 December 2018, 14:15:25 UTC
cd25f83	Michael Ringgaard	06 December 2018, 16:07:41 UTC	Fix bug in taxonomy (#307)	06 December 2018, 16:07:41 UTC
d51bac6	Michael Ringgaard	06 December 2018, 11:15:11 UTC	Consolidate reduction ops (#306)	06 December 2018, 11:15:11 UTC
d5fe5bd	Michael Ringgaard	05 December 2018, 20:44:54 UTC	Myelin test suite and bug fixes (#305)	05 December 2018, 20:44:54 UTC
666befd	rahul1980	04 December 2018, 18:35:29 UTC	Use Python API for frame evaluation (#304) Bonus: Since the Python API doesn't need to read the common store from a file, we can get rid of commons_path at various places.	04 December 2018, 18:35:29 UTC
523f73f	Michael Ringgaard	04 December 2018, 12:18:06 UTC	Python API for Myelin (#302)	04 December 2018, 12:18:06 UTC
756bebf	Michael Ringgaard	03 December 2018, 08:59:01 UTC	Python iterator for RecordDatabase (#303)	03 December 2018, 08:59:01 UTC
db39e09	Michael Ringgaard	30 November 2018, 13:30:05 UTC	Myelin fixes to make the new parser run on GPU (#299)	30 November 2018, 13:30:05 UTC