b735226 | André Santos | 30 May 2021, 17:36:12 UTC | Extract jobs from Manager to CurrentJobs class | 30 May 2021, 17:36:12 UTC |
7a9108a | André Santos | 29 May 2021, 18:42:09 UTC | Update readme | 29 May 2021, 18:42:09 UTC |
a7b92b9 | André Santos | 29 May 2021, 18:41:07 UTC | Add README | 29 May 2021, 18:41:07 UTC |
42a8caa | André Santos | 29 May 2021, 18:18:24 UTC | Cancel domain crawl when it times out. Manager emits event, ManagerPubSub sends it to Redis. WorkerPubSub adds it to a list of canceled jobs, and Worker stops iterating over the resources. | 29 May 2021, 18:18:24 UTC |
3c87d64 | André Santos | 28 May 2021, 20:11:37 UTC | Fix bug in job timeout, canceling and postponing | 28 May 2021, 20:11:37 UTC |
4ef96d6 | André Santos | 27 May 2021, 20:03:28 UTC | minor | 27 May 2021, 20:05:59 UTC |
a7c4476 | André Santos | 27 May 2021, 19:44:42 UTC | Read seeds from data/seeds.txt file | 27 May 2021, 19:44:42 UTC |
d870039 | André Santos | 27 May 2021, 19:37:41 UTC | Worker ids are now UUID instead of PID | 27 May 2021, 19:43:43 UTC |
7c570f6 | André Santos | 27 May 2021, 11:05:18 UTC | move stuff around | 27 May 2021, 11:05:18 UTC |
6907b00 | André Santos | 25 May 2021, 21:12:20 UTC | Docker! | 25 May 2021, 21:12:20 UTC |
095e026 | André Santos | 25 May 2021, 20:18:53 UTC | Fix stuff | 25 May 2021, 21:11:57 UTC |
a89642b | André Santos | 25 May 2021, 19:52:14 UTC | Preparing for docker | 25 May 2021, 19:52:14 UTC |
1bc0a32 | André Santos | 20 May 2021, 12:10:00 UTC | minor | 20 May 2021, 12:10:00 UTC |
e6e8690 | André Santos | 20 May 2021, 12:09:39 UTC | Remove debug prints | 20 May 2021, 12:09:39 UTC |
93138bb | André Santos | 20 May 2021, 12:04:02 UTC | Fix bug removing www from URLs. `mongoose-type-url` uses `normalize-url`, which by default removes some stuff. Replaced with custom type using String and a validator function which uses `try { new URL(url) }`. | 20 May 2021, 12:04:02 UTC |
9743031 | André Santos | 20 May 2021, 11:51:19 UTC | Add url validation library. Add tests | 20 May 2021, 11:51:19 UTC |
50398f5 | André Santos | 20 May 2021, 09:46:37 UTC | Start fixing crawling with wikidata seeds | 20 May 2021, 09:46:37 UTC |
0437b8a | André Santos | 20 May 2021, 09:38:36 UTC | Fix path in worker-pool script | 20 May 2021, 09:38:36 UTC |
cbdba03 | André Santos | 19 May 2021, 21:39:15 UTC | Manager and worker starting ok | 20 May 2021, 09:17:40 UTC |
b6d8367 | André Santos | 19 May 2021, 19:47:35 UTC | Fixing worker | 19 May 2021, 19:47:35 UTC |
3050a8d | André Santos | 19 May 2021, 19:28:50 UTC | split package.json | 19 May 2021, 19:28:50 UTC |
b1ab942 | André Santos | 19 May 2021, 18:56:46 UTC | more | 19 May 2021, 18:56:46 UTC |
696cce2 | André Santos | 19 May 2021, 18:49:04 UTC | move stuff into separate folders | 19 May 2021, 18:49:04 UTC |
2faa3be | André Santos | 19 May 2021, 18:43:10 UTC | Register and deregister jobs in Manager | 19 May 2021, 18:43:10 UTC |
6c1ad0d | André Santos | 19 May 2021, 18:41:21 UTC | Fix bug in cheerio css selector | 19 May 2021, 18:41:21 UTC |
2943b46 | André Santos | 18 May 2021, 12:55:52 UTC | Make Manager an ES6 class | 18 May 2021, 12:55:52 UTC |
2daa6f4 | André Santos | 18 May 2021, 12:26:41 UTC | Remove debug prints | 18 May 2021, 12:26:41 UTC |
f832984 | André Santos | 13 May 2021, 20:41:35 UTC | ups db | 13 May 2021, 20:41:35 UTC |
db4a20a | André Santos | 13 May 2021, 20:36:25 UTC | Mongodb connection URI | 13 May 2021, 20:36:25 UTC |
9e6a569 | André Santos | 08 May 2021, 00:50:23 UTC | maybe fixed pathHeads and headCount | 08 May 2021, 00:50:23 UTC |
ba568a5 | André Santos | 07 May 2021, 19:52:06 UTC | pathHeads | 07 May 2021, 19:52:06 UTC |
9e0a511 | André Santos | 06 May 2021, 22:17:29 UTC | A LOT OF STUFF | 06 May 2021, 22:17:29 UTC |
821ffa3 | André Santos | 01 May 2021, 19:59:46 UTC | Stuff | 01 May 2021, 19:59:46 UTC |
65521d7 | André Santos | 16 April 2021, 12:12:31 UTC | Mark paths as active/finished/disabled | 16 April 2021, 12:24:30 UTC |
047db63 | André Santos | 15 April 2021, 16:51:12 UTC | LCB | 15 April 2021, 16:51:12 UTC |
6118288 | André Santos | 15 April 2021, 15:50:53 UTC | Building paths seems ok | 15 April 2021, 15:50:53 UTC |
23f00f0 | André Santos | 14 April 2021, 23:02:28 UTC | Inserting triples | 14 April 2021, 23:02:28 UTC |
3212407 | André Santos | 13 April 2021, 22:55:22 UTC | Crawling domains | 13 April 2021, 22:55:22 UTC |
b4e0e53 | André Santos | 13 April 2021, 16:50:15 UTC | kickoff: init function working | 13 April 2021, 16:50:22 UTC |
4e959b9 | André Santos | 05 April 2021, 10:41:06 UTC | Stuff | 05 April 2021, 10:43:33 UTC |
3bc35e4 | André Santos | 11 March 2021, 18:14:46 UTC | Simplify delay function | 11 March 2021, 18:14:46 UTC |
0337eaa | André Santos | 11 March 2021, 11:35:59 UTC | Dump triples | 11 March 2021, 11:35:59 UTC |
4e477d6 | André Santos | 11 March 2021, 11:32:42 UTC | Dump triples | 11 March 2021, 11:32:42 UTC |
1024f3f | André Santos | 10 March 2021, 20:13:28 UTC | Calc request interval histogram and more stuff | 10 March 2021, 20:13:28 UTC |
945147d | André Santos | 09 March 2021, 15:58:34 UTC | Parse <link> tags in HTML files | 09 March 2021, 15:58:34 UTC |
0e35325 | André Santos | 04 March 2021, 11:29:54 UTC | Add triples from crawlDomain | 04 March 2021, 11:29:54 UTC |
134329a | André Santos | 02 March 2021, 18:09:33 UTC | Validator for URLs in models | 02 March 2021, 18:09:33 UTC |
eaa7b92 | André Santos | 02 March 2021, 16:26:59 UTC | Split pubsub methods from Manager | 02 March 2021, 16:26:59 UTC |
7fd9eeb | André Santos | 02 March 2021, 12:44:06 UTC | Split pubsub code from Worker to WorkerPubSub | 02 March 2021, 12:44:06 UTC |
047dcd7 | André Santos | 02 March 2021, 11:25:09 UTC | Model.upsertMany for Resources and Domains | 02 March 2021, 11:25:09 UTC |
f8f976a | André Santos | 01 March 2021, 17:42:23 UTC | Log axios with winston | 01 March 2021, 17:42:23 UTC |
a15b899 | André Santos | 01 March 2021, 12:23:08 UTC | Improve logging | 01 March 2021, 12:23:08 UTC |
b660494 | André Santos | 26 February 2021, 17:56:08 UTC | domain crawl implemented in worker | 26 February 2021, 17:56:08 UTC |
e106cc2 | André Santos | 26 February 2021, 17:54:53 UTC | Db config | 26 February 2021, 17:54:53 UTC |
19d9f6a | André Santos | 25 February 2021, 19:44:52 UTC | domainCheck working | 25 February 2021, 19:44:52 UTC |
5c824aa | André Santos | 23 February 2021, 18:17:02 UTC | assignJobs published 1 job at a time | 23 February 2021, 18:17:02 UTC |
182db16 | André Santos | 23 February 2021, 17:37:23 UTC | Stuff | 23 February 2021, 17:37:23 UTC |
35690fe | André Santos | 23 February 2021, 17:36:58 UTC | Script for clearing db | 23 February 2021, 17:36:58 UTC |
56a3c91 | André Santos | 23 February 2021, 17:36:36 UTC | Script for adding project | 23 February 2021, 17:36:36 UTC |
13e935e | André Santos | 23 February 2021, 17:36:16 UTC | Add bluebird | 23 February 2021, 17:36:27 UTC |
0e7c43e | André Santos | 23 February 2021, 17:35:48 UTC | Make worker pool kill workers on exit | 23 February 2021, 17:35:48 UTC |
68bca0e | André Santos | 18 February 2021, 09:44:02 UTC | Add project, checkDomain | 18 February 2021, 09:44:02 UTC |
eaee1cc | André Santos | 16 February 2021, 23:20:01 UTC | Save domain check | 16 February 2021, 23:20:22 UTC |
4e62484 | André Santos | 16 February 2021, 18:04:41 UTC | Domain check | 16 February 2021, 18:04:41 UTC |
447fb3b | André Santos | 16 February 2021, 15:08:32 UTC | stuff | 16 February 2021, 15:08:32 UTC |
af75008 | André Santos | 15 February 2021, 17:05:21 UTC | stuff | 15 February 2021, 17:05:21 UTC |
58a994f | André Santos | 10 February 2021, 22:16:07 UTC | Resource.getNext to pop next resource to crawl | 10 February 2021, 22:16:07 UTC |
0cea238 | André Santos | 10 February 2021, 21:27:56 UTC | kickoff | 10 February 2021, 21:27:56 UTC |