034aba1 | Douglas Thain | 29 August 2024, 19:59:52 UTC | Doc: Fix Build Namespace and Broken Links (#3925) * - Modify doc build and test instructions to match how readthedocs does it. - Update pymdown snippets to allow inclusion outside of manual root. - Update pymdown snippets to fail on build if snippet can't be found. * Remove symlink to code-examples, make examples refer to examples in source dir directly. * Fix up broken internal references. | 29 August 2024, 19:59:52 UTC |
7517035 | Douglas Thain | 23 August 2024, 15:41:04 UTC | newline in help | 23 August 2024, 15:41:31 UTC |
3e1633d | Colin Thomas | 22 August 2024, 17:26:11 UTC | set resources in factories (#3924) | 22 August 2024, 17:26:11 UTC |
0ec5c00 | Benjamin Tovar | 16 August 2024, 18:09:50 UTC | vine: do not use temp for target constant keys in dask (#3918) | 16 August 2024, 18:09:50 UTC |
f5b2d62 | Douglas Thain | 16 August 2024, 16:04:16 UTC | These are unit and regression tests, not integration tests! | 16 August 2024, 16:04:54 UTC |
c4da655 | Colin Thomas | 15 August 2024, 18:56:26 UTC | user resource specifications are absolute (#3914) | 15 August 2024, 18:56:26 UTC |
a0d692c | Colin Thomas | 14 August 2024, 17:34:44 UTC | vine: default worker disk to half available (#3916) * default worker disk to half available * comment | 14 August 2024, 17:34:44 UTC |
c61c140 | JinZhou5042 | 09 August 2024, 18:12:54 UTC | vine: temporarily remove resource check for function calls (#3913) * vine: remove resource check for function calls * add comment * fix comment | 09 August 2024, 18:12:54 UTC |
386c1bd | Colin Thomas | 08 August 2024, 20:43:34 UTC | vine: remove disk check for library task (#3909) * do not measure disk for function calls * format | 08 August 2024, 20:43:34 UTC |
3a96add | Barry Sly-Delgado | 07 August 2024, 17:58:54 UTC | dont catch import error when creating library (#3907) Co-authored-by: Barry Jay Sly-Delgado <slydelgado1@czvnc5.llnl.gov> | 07 August 2024, 17:58:54 UTC |
5624948 | Douglas Thain | 07 August 2024, 17:51:45 UTC | Doc: Update MD to match M4 (#3908) * Fix minor error in README. * Commit .md files generated by .m4 files, so that they are incorporated into readthedocs properly. | 07 August 2024, 17:51:45 UTC |
15c0b00 | Douglas Thain | 07 August 2024, 15:27:00 UTC | Vine: Function Slots Default to Number of Cores (#3904) * Allow a LibraryTask to automatically set the number of function slots to the number of cores allocated to the task. This will happen by default unless the user calls vine_task_set_function_slots(). To do this requires separating task->function_slots_requested from task->function_slots_total. The latter is now assigned at task-commit-time. * Update docs to reflect that set_function_slots is optional. * Update serverless test to use a large worker, then let library fill available space. | 07 August 2024, 15:27:00 UTC |
35b3411 | JinZhou5042 | 07 August 2024, 14:24:01 UTC | rename import_modules to hoisting_modules (#3903) | 07 August 2024, 14:24:01 UTC |
7348e3e | JinZhou5042 | 07 August 2024, 13:11:42 UTC | vine: deserialize argument infile before forking (#3902) | 07 August 2024, 13:11:42 UTC |
dc3a2cd | Douglas Thain | 06 August 2024, 14:42:00 UTC | Batch: Rework Interface Into Queue, Task, File Objects (#3898) * Rework batch interface to accept batch_job_submit(struct batch_task *t). This will allow us to pass properly formed file objects and make use of more features of taskvine from makeflow and the factory. * Makeflow now uses batch_job_submit(queue,task) as promised long ago. * Blue Waters was decomissioned in 2023. * Remove largely unused batch_fs operations. * Clean up batch_queue structures. * Revert batch_fs operations back to basic unix operations. * Remove unused functions, format for consistency. * Separate batch_job_info into a separate file. * Use batch_queue consistently to describe a queue. * batch_task -> batch_job * Update copyright boilerplate * Update batch_queue_cluster.c to new interface. * Update batch_queue_amazon.c to new interface. * Add back in batch_queue_amazon and batch_queue_cluster. * Replace throughout: batch_job -> batch_queue batch_task -> batch_job * Remove mesos from work_queue_factory. * Remove mesos support. * Remove MPI from makeflow. * Remove unused makeflow-mpi support. * Fix up batch_queue_k8s by transforming file lists to strings in batch_queue_k8s_submit. * Fix incorrect ordering of inner file / outer file. * batch_fs_unlink is replaced by unlink_recursive, not just unlink (caught by TR_makeflow_restart.sh) * Produce a factory log in test. * Do not attach worker log as output unless debugging is requested. * Move retired modules to old * Clean up makefile to incorporate new headers. * Complete autodocs for batch_job/file/queue/wrapper * Remove amazon_batch, lambda, and mesos options from makeflow. * Remove Amazon Batch, Lambda, Mesos, and MPI from man pages. * Remove Amazon Batch, Lambda, Mesos, MPI from manual. * explicitely -> explicitly * Remove unused chirp symbol * Make autodocs example complete with input file. * Remove unused items from makeflow man page. * Reorder and doc batch system types. 
* General renaming of sge to uge in code and manuals. sge is retained as an alias for uge in batch queue selection. * sge -> uge in vine_submit_workers * sge -> uge * format and lint batch_job module * Apply clang formatting rules. * Add "experimental" message for batch modules in which we have lower confidence. * queue -> remote_queue * Fix up condor afs error message * experimental flag should be a feature not an option | 06 August 2024, 14:42:00 UTC |
c92bbd3 | Douglas Thain | 05 August 2024, 20:52:58 UTC | Makeflow: Man page bits for SSL. (#3896) * Add missing man page entries for makeflow ssl * ssl-key and ssl-cert * ssl-cert and ssl-key | 05 August 2024, 20:52:58 UTC |
f3616b5 | Douglas Thain | 05 August 2024, 20:19:47 UTC | Remove taskvine_json and vine_api_proxy from C side of things. (#3897) We are going to go a different way and interface from Python. | 05 August 2024, 20:19:47 UTC |
7c0510b | Barry Sly-Delgado | 05 August 2024, 13:13:03 UTC | Futures stream output (#3899) * refactory result retrieval * fetch file for FuturePythonTask * update file * change task creation * fix function call creation * funcalls manager name --------- Co-authored-by: Barry Jay Sly-Delgado <slydelgado1@czvnc5.llnl.gov> | 05 August 2024, 13:13:03 UTC |
ac9f0b5 | Colin Thomas | 31 July 2024, 17:55:15 UTC | fix task disk measurement (#3895) | 31 July 2024, 17:55:15 UTC |
4b43490 | Barry Sly-Delgado | 31 July 2024, 12:58:49 UTC | Vine: add wait and as_completed for futures (#3876) * adds wait and as_completed * call super and add test * fix test * use futures not tasks * tpyo * raise exeception and fix recursion --------- Co-authored-by: Barry Jay Sly-Delgado <slydelgado1@czvnc5.llnl.gov> | 31 July 2024, 12:58:49 UTC |
06531dc | JinZhou5042 | 30 July 2024, 15:47:21 UTC | vine: fix useless recovery tasks submission (#3891) * vine: fix useless recovery tasks submission * variable name change: found -> round_replication_count * comment fix | 30 July 2024, 15:47:21 UTC |
7c0bc51 | JinZhou5042 | 23 July 2024, 14:50:22 UTC | vine: recover lost temporary files on worker removal (#3887) * vine: recover lost temporary files on worker removal * set default temp_replica_count to 1 | 23 July 2024, 14:50:22 UTC |
1a1880a | JinZhou5042 | 20 July 2024, 18:03:37 UTC | vine: submit all potential recovery tasks for one task in one go (#3885) * vine: submit all recovery tasks for one task at once * vine: submit all recovery tasks for one task at once | 20 July 2024, 18:03:37 UTC |
865b17d | JinZhou5042 | 18 July 2024, 15:49:22 UTC | vine: stage out pythontask input properly (#3884) | 18 July 2024, 15:49:22 UTC |
108ba87 | JinZhou5042 | 18 July 2024, 13:44:04 UTC | vine: prune stale files in daskvine (#3880) * prune stale files in daskvine * merge prune to declare * minor fix * minor fix * minor fix * wrap daskvine._prune_cluster * add option prune_cluster to daskvine manager * use prune_files instead of prune_cluster | 18 July 2024, 13:44:04 UTC |
58140aa | Benjamin Tovar | 17 July 2024, 14:12:37 UTC | fix calloc arguments, batch_job_dryrun (#3881) | 17 July 2024, 14:12:37 UTC |
4665a7f | Douglas Thain | 10 July 2024, 14:36:34 UTC | Better error handling for GPU detection. (#3879) * Update GPU detection to check exit status of nvidia-smi command. Otherwise this can result in the perceived detection of a GPU called "Failed to initialize NVML". * format | 10 July 2024, 14:36:34 UTC |
20e9257 | Colin Thomas | 09 July 2024, 16:25:32 UTC | vine: sum disk usage to avoid overfilling worker allocation (#3877) * disk accounting * clean up debug * account for disk inuse * basic allocation estimate * do not modify rmsummary if no input files * semantics | 09 July 2024, 16:25:32 UTC |
4fbdc30 | Benjamin Tovar | 03 July 2024, 18:34:39 UTC | disable centos7 (#3878) | 03 July 2024, 18:34:39 UTC |
a346eb0 | Douglas Thain | 25 June 2024, 14:19:52 UTC | Use vine_set_framework to indicate when makeflow is using taskvine. (#3875) | 25 June 2024, 14:19:52 UTC |
4ce19d2 | Douglas Thain | 24 June 2024, 17:34:16 UTC | Makeflow: Add SSL Key/Cert for WQ/Vine (#3874) * - Added support for ssl_key and ssl_cert in batch_job interface. - batch_job_{vine|work_queue}_create now use ssl in constructor. - Added Makeflow options to set ssl key and cert. * Added ssl key/cert to manual. | 24 June 2024, 17:34:16 UTC |
112480d | Douglas Thain | 24 June 2024, 15:44:35 UTC | Chirp: Cleanup code and remove some dependencies. (#3873) * Move MQ source and tests into devel/mq * Add jx_parse_string_length for dealing with non-C strings * Move confuga and related components to devel. * Move confuga files to devel * Remove embedded sqlite3. * Disconnect confuga and old json library. * Remove batch_job_chirp * Remove unused mention of old json module. * Added jx_parse_string_and_length * Use chirp_jobid_t as argument to chirp_job API instead of json generally. * commentary * move old tests * format * Add missing chirp_job.h | 24 June 2024, 15:44:35 UTC |
31cd6aa | Douglas Thain | 21 June 2024, 17:41:08 UTC | Format: Fix Crazy Line Breaks (#3872) * Format: Keep escaped newlines close to line. * Format: Remove column limit so that logically related things stay on the same line. * format | 21 June 2024, 17:41:08 UTC |
5f504be | Douglas Thain | 21 June 2024, 16:54:02 UTC | Port: Patches to Build on FreeBSD (#3871) * Applied patches to generate PR. * clean up freebsd fixes * format * Rely on -fPIC coming from configure. Only build chirp_swig_wrap as part of bindings. * Do not sort includes -- in some cases order matters. | 21 June 2024, 16:54:02 UTC |
18b8f3e | Douglas Thain | 21 June 2024, 13:56:46 UTC | Chirp: Use openssl pkeyutl instead of rsautl (#3869) * Switch auth_ticket to use openssl pkeyutl instead of rsautl, which is deprecated. Reworked auth_ticket_accept to be more idiomatic cctools style. * Use struct list instead of a dynamic array. * Just iterate over the list. * Just use strtok directly. * Uncomment cleanup from debugging. * format * Check for pkeyutl in configure * Pass only signature file to openssl rsautl * format * comment details of pkeyutl * Mark conditional bits as clang-format off to satisfy linter. | 21 June 2024, 13:56:46 UTC |
95e8a8b | Douglas Thain | 20 June 2024, 19:43:25 UTC | List all authentication methods in man pages. (#3870) List all authentication methods in header docs. | 20 June 2024, 19:43:25 UTC |
e146784 | JinZhou5042 | 18 June 2024, 15:41:03 UTC | vine: assign size to t->size in stage_output_file (#3867) * assign size to t->size in stage_output_file * format * lint issue | 18 June 2024, 15:41:03 UTC |
925bd27 | JinZhou5042 | 17 June 2024, 19:42:44 UTC | vine: consistently use int64 in copy_file_to_file (#3863) * consistently use int64 in copy_file_to_file * use int for opening fd * use int64 in putfile * use int64 for batch_fs_putfile * use int64 in various putfile * check success < 0 instead of <= 0 for copy_file_to_file | 17 June 2024, 19:42:44 UTC |
59ddda9 | Douglas Thain | 12 June 2024, 18:55:14 UTC | Vine: Keep Scheduler Stateless for Libraries/Functions (#3857) * The scheduler should *not* modify the state of the system in check_worker_against_task. Instead, when scheduling a function call, look for compatibility and capacity for library tasks, and then dispatch them in commit_task_to_worker. * format | 12 June 2024, 18:55:14 UTC |
391ff0b | Douglas Thain | 12 June 2024, 18:54:57 UTC | Factory: Check protocol version before submitting workers. (#3861) * Both work_queue and vine_manager now report protocol versions to the catalog, which is checked by the factory, so as to avoid submitting incompatible workers. * format | 12 June 2024, 18:54:57 UTC |
8d1a285 | Douglas Thain | 12 June 2024, 18:33:12 UTC | Give C compile command relative to CONDA_PREFIX (#3860) | 12 June 2024, 18:33:12 UTC |
325160c | Douglas Thain | 12 June 2024, 17:16:23 UTC | Build: M1 (#3859) * Add macos-14 (implies m1) to build. * - Distinguish between conda and native builds. - Explicitly label macos-14 as macos-14-arm64 (which it aliases anyway) | 12 June 2024, 17:16:23 UTC |
0179753 | JinZhou5042 | 11 June 2024, 13:45:48 UTC | vine: handle library failures (#3855) | 11 June 2024, 13:45:48 UTC |
f4dceb8 | Barry Sly-Delgado | 10 June 2024, 16:44:22 UTC | Vine: Buffer Asynchronous Messages From Worker for Async Task Completion (#3669) * base async worker buffer * convert some messages to use async * async task complete messages * format and remove asm * make printf happy * fixes * base task handling * working light task handling * light task dask option * light changes * more changes * fix * test fix * lint * task_id type fix * forsaken fix * terminate output string on worker side * split workers with available results * format * format * remove forsaken return * requested changes * stat failure handling * update task clean * fix completed task handling * re-add warning for short running failed taks * re-add forsaken temp fix --------- Co-authored-by: Benjamin Tovar <btovar@nd.edu> | 10 June 2024, 16:44:22 UTC |
e3832bd | Barry Sly-Delgado | 07 June 2024, 18:46:04 UTC | VINE: add miscellaneous plotting tools (#3852) * add tools * update doc * name change | 07 June 2024, 18:46:04 UTC |
0e7f46c | Benjamin Tovar | 04 June 2024, 14:33:03 UTC | vine: compute hash only when cache level requires it (#3853) * vine: compute hash only when cache level requires it * Revert "vine: compute hash only when cache level requires it" This reverts commit bfbbd5ac8ff6803021a1e9b8fe1024938883c811. * compute hash name when using worker level cache | 04 June 2024, 14:33:03 UTC |
a6f9609 | JinZhou5042 | 31 May 2024, 12:04:36 UTC | vine: combine imports when calculating function hash (#3837) * vine: combine imports and remove comments when calculating function hash * remove regexs, represent source_code as a list * remove unrelated * modify generate_functions_hash * remove unrelated one --------- Co-authored-by: Benjamin Tovar <btovar@nd.edu> | 31 May 2024, 12:04:36 UTC |
ea9c71b | Benjamin Tovar | 30 May 2024, 15:33:32 UTC | Clean up swig bindings (#3847) * remove non taskvine.h files from .i * taskvine version through taskvine.h * runtime dirs to taskvine.h * move needed file functions to taskvine.h * fix task resources * check for nulls | 30 May 2024, 15:33:32 UTC |
dcb514a | Benjamin Tovar | 30 May 2024, 15:32:45 UTC | vine: correctly account for time/stats in no_wait (#3849) | 30 May 2024, 15:32:45 UTC |
c04f589 | JinZhou5042 | 30 May 2024, 14:04:41 UTC | vine: check if the library exists when submitting FunctionCall tasks (#3832) * vine: name check for function calls * fix for future funcalls * remove test print * pass function_list as an arg * lint issue * minor fix * minor fix * minor issue * shift python apis to C site * lint issue * fix * remove redundant code * lint issue * remove bookkeeping * lint * add manager.h to construct bindings * move vine_manager_find_library_template to taskvine.i * comment | 30 May 2024, 14:04:41 UTC |
a59f501 | Douglas Thain | 30 May 2024, 13:59:49 UTC | Catalog: Make Tests More Robust (#3850) * Add ssl_port option to catalog server to allow for more flexible testing. * - Allow catalog tests to choose arbitrary ports. - Wait for catalog server to start properly before continuing. - Kill server with SIGTERM to allow for normal cleanup. * Remove incomplete update_port_file * Run regular test and ssl test on distinct port ranges | 30 May 2024, 13:59:49 UTC |
d2b738c | Douglas Thain | 29 May 2024, 16:38:43 UTC | Fix mismatched quote. | 29 May 2024, 16:39:10 UTC |
be20082 | Benjamin Tovar | 29 May 2024, 16:14:14 UTC | http to https makeflow examples (#3848) | 29 May 2024, 16:14:14 UTC |
d0a5423 | Benjamin Tovar | 21 May 2024, 17:11:20 UTC | activate almalinux9 (#3842) * move all to almalinux, activate almalinux 9 * github almalinux 9 * specific version cvmfs-devel to get libcvmfs.a * include threadpoolctl * cleanup almalinux8 | 21 May 2024, 17:11:20 UTC |
4ce7ce6 | Douglas Thain | 21 May 2024, 13:50:33 UTC | Tighten up PR template. | 21 May 2024, 13:50:46 UTC |
cd3b4de | Douglas Thain | 20 May 2024, 16:06:35 UTC | Infra: Update GitHub actions to v4 (#3841) * Update actions to v4 to squelch github complaints. * Go back to v3 for centos7. * apparently comments aren't allowed in yaml * tab * spaces | 20 May 2024, 16:06:35 UTC |
7af6625 | Greg Pauloski | 17 May 2024, 14:05:22 UTC | vine: fix doc typos, code typos, and callback signatures for `FuturesExecutor` (#3836) * Fix typos and callback signatures in FuturesExecutor * Fix typos in FuturesExecutor examples | 17 May 2024, 14:05:22 UTC |
9305098 | Benjamin Tovar | 10 May 2024, 14:51:47 UTC | vine: dask executor worker transfers segfault fix (#3831) * vine: tasks and funcalls behave the same wrt undeclaring files The caller should be in charge of undeclaring outputs. * vine: renamed lazy_transfers to worker_transfers * add missing comma * remove debug statement * fix typo | 10 May 2024, 14:51:47 UTC |
74a824a | Benjamin Tovar | 10 May 2024, 01:29:55 UTC | vine: temporary, release worker that failed to xfer input file (#3818) * add xfer streak counters * need to keep source and destination counts separate * simplify, blocking gave interesting interactions * format | 10 May 2024, 01:29:55 UTC |
f58f3eb | Benjamin Tovar | 10 May 2024, 01:28:01 UTC | vine: release lib tasks (#3824) * vine: remove lib tasks from retrieved list when using tags * vine: add comment del lib instances when done * fix leak of resources in vine_task_copy * delete lists before copy * vine: do not copy template unless it is needed. check_worker_against_task does not modify the target task, thus we can avoid copying over and over for workers that can't fit the library. * vine: copy directly to list in vine_task_copy * fix typo * format * add dropped function_slots * do not use buffers with function calls They are incompatible with the dask vine executor checkpoints. Need to rethink the combination of buffer and temporary outputs. | 10 May 2024, 01:28:01 UTC |
f949c75 | Benjamin Tovar | 08 May 2024, 18:57:34 UTC | vine: fn call write input to disk if output survives task (#3821) otherwise it keeps using the buffer as the file is never vine_file_delete'd | 08 May 2024, 18:57:34 UTC |
dab2ec6 | Benjamin Tovar | 08 May 2024, 18:50:17 UTC | vine: Set library types (#3823) * vine: set instance and template on lib task creation * do not check lib tasks for recovery task creation * format | 08 May 2024, 18:50:17 UTC |
c9ae25c | Benjamin Tovar | 08 May 2024, 17:51:29 UTC | vine: do not use rampdown heuristic by default (dask executor) (#3822) It does not play well with overcommitment. | 08 May 2024, 17:51:29 UTC |
e924b66 | Benjamin Tovar | 08 May 2024, 17:37:27 UTC | vine: add wrapper to funcalls in dask executor (#3820) Simplify by moving wrapping code into execute vertex. | 08 May 2024, 17:37:27 UTC |
4c93727 | Benjamin Tovar | 07 May 2024, 23:19:22 UTC | vine: temporary change: do not fail task if input file changed (#3817) coffea casa touches token files, which modifies file times without modifying the contents. | 07 May 2024, 23:19:22 UTC |
f869fdc | Benjamin Tovar | 07 May 2024, 23:18:44 UTC | vine: Wait no wait (#3815) * vine: wait_no_wait get retrieved tasks without doing any more work * adds doc strings * format * timeout 0 means no hang * simplify dask executor * update manager bindings * wait->wait_for_tag * fix tag | 07 May 2024, 23:18:44 UTC |
f9760f6 | Benjamin Tovar | 07 May 2024, 16:03:47 UTC | Daskvine instrument (#3812) * vine: dask executor add extra fns to library * add wrapper to task * fix example * vine: library name per dag * ensure wrapper result exists * format | 07 May 2024, 16:03:47 UTC |
2ac98f6 | Barry Sly-Delgado | 07 May 2024, 15:56:42 UTC | change LIBRARY to TEMPLATE and INSTANCE (#3814) * change LIBRARY to TEMPLATE and INSTANCE * format --------- Co-authored-by: Benjamin Tovar <btovar@nd.edu> | 07 May 2024, 15:56:42 UTC |
6e3a805 | Benjamin Tovar | 07 May 2024, 14:20:02 UTC | vine: do not hash functions with @monitored decorator (#3816) * vine: do not hash functions with @monitored decorator * format | 07 May 2024, 14:20:02 UTC |
1b79828 | Benjamin Tovar | 06 May 2024, 15:01:36 UTC | library_cleanup (#3808) Co-authored-by: Colin Thomas <cthomas0687@gmail.com> | 06 May 2024, 15:01:36 UTC |
d94a1a2 | Chris Boumalhab | 04 May 2024, 16:06:46 UTC | small enhancement (#3813) Co-authored-by: Chris Boumalhab <cboumalh@crcfe01.crc.nd.edu> | 04 May 2024, 16:06:46 UTC |
9dbce11 | Chris Boumalhab | 02 May 2024, 17:21:34 UTC | Vine: Load in data to cache from the shared filesystem (#3756) * testing things out * linting * added a free * function name change * PR comment fixes * use string_format * removed const * fixed const and frees * add const * added const * lint * changes from PR comments * Update vine_file.c --------- Co-authored-by: Chris Boumalhab <cboumalh@cclws17.cse.nd.edu> Co-authored-by: Benjamin Tovar <btovar@nd.edu> | 02 May 2024, 17:21:34 UTC |
ec3e6d6 | Benjamin Tovar | 02 May 2024, 15:43:57 UTC | vine: minor dask vine fixes (#3809) | 02 May 2024, 15:43:57 UTC |
d83e98a | Benjamin Tovar | 02 May 2024, 15:04:32 UTC | vine: limit temp replica requests per cycle (#3776) * vine: limit temp replica requests per cycle Consider only q->attemp_schedule_depth temp files when requesting replicas. * add tune doc * fix strncpy | 02 May 2024, 15:04:32 UTC |
c975674 | Thanh Son Phung | 02 May 2024, 14:49:42 UTC | vine: change `clone` to `addref` (#3799) * fix * refadded object to referenced object, etc. --------- Co-authored-by: Benjamin Tovar <btovar@nd.edu> | 02 May 2024, 14:49:42 UTC |
433ee4c | Benjamin Tovar | 02 May 2024, 14:47:29 UTC | vine: function call exit code (#3803) * lib code: send exit status of fcall * vine: function calls get an exit code * vine: different exit codes in fork method * vine: lint * vine: remove rand testing statement * format * vine: remove another debug | 02 May 2024, 14:47:29 UTC |
c25c625 | Thanh Son Phung | 02 May 2024, 14:40:11 UTC | fix (#3796) | 02 May 2024, 14:40:11 UTC |
bd0e70f | Benjamin Tovar | 02 May 2024, 14:12:10 UTC | vine: set a limit on the number of tasks submitted to manager (#3807) With graphs which first layer is very large, this is useful as the manager does not have to wait for all tasks to be submitted before starting to do some work, that is, this helps overlap task submission with execution. | 02 May 2024, 14:12:10 UTC |
9b9bd3c | Benjamin Tovar | 02 May 2024, 00:20:13 UTC | vine: task library to follow set/get convention. (#3794) * vine: task library to follow set/get convention. Also fixes a bug where if a task both required and provided a library, the scheduler would go into an infinite loop. * fix args | 02 May 2024, 00:20:13 UTC |
891d572 | Benjamin Tovar | 02 May 2024, 00:16:29 UTC | vine: send library memleak (#3791) * vine: checking lib matches should not modify task task modifications should be done on commit. * vine: ensure library is only sent once to a worker * remove debug msg * vine: do not duplicate library if not needed * vine: ensure deletion of copies of library tasks * vine: allow multiple libraries of same type at worker * delete only twice library tasks * Revert "vine: do not duplicate library if not needed" This reverts commit fa8682212136ae696f054fb871f2a11b0d0cc462. * add comments | 02 May 2024, 00:16:29 UTC |
d115c3e | Benjamin Tovar | 02 May 2024, 00:15:42 UTC | vine: dask add progress bar (#3801) * vine: dask add progress bar * progress bar really optional | 02 May 2024, 00:15:42 UTC |
59e59b8 | Benjamin Tovar | 02 May 2024, 00:14:15 UTC | vine: manager check temps (#3802) * vine: remove unused transient result failure from enum * vine: bug: only last result of fetch output was kept * vine: set output missing result only once per task * vine: error if temp file not created as output * vine: do not generate output that cloudpickle can't decode * vine: return status 1 if call no good * vine: vine_sandbox return value didnt matter * add comment warning | 02 May 2024, 00:14:15 UTC |
03a03f1 | Benjamin Tovar | 02 May 2024, 00:10:49 UTC | Daskvine fix repeated keys (#3805) * vine: avoid running keys twice Seldom on init, some keys would get schedule for execution twice. Changed from list to dictionaries to eliminate repetitions. * format | 02 May 2024, 00:10:49 UTC |
0d3c115 | JinZhou5042 | 30 April 2024, 17:12:34 UTC | vine: threadpoolctl for function calls (#3772) * vine: threadpoolctl for function calls * lint issue | 30 April 2024, 17:12:34 UTC |
4a9b308 | Chris Boumalhab | 30 April 2024, 17:09:09 UTC | Condor: Chaos Monkey (#3774) * added chaos monkey script * set up done * move tuple argument * change prog name in argpasrse * Update condor_chaos_monkey --------- Co-authored-by: Chris Boumalhab <cboumalh@cclws17.cse.nd.edu> Co-authored-by: Chris Boumalhab <cboumalh@crcfe01.crc.nd.edu> | 30 April 2024, 17:09:09 UTC |
a1eb387 | Thanh Son Phung | 30 April 2024, 17:08:39 UTC | TaskVine: Fix logging of library deployment (#3780) * fix library logging * isolate fix | 30 April 2024, 17:08:39 UTC |
baaf00d | Benjamin Tovar | 29 April 2024, 14:41:40 UTC | vine, wq: version strings (#3793) * dttools: function to get version string * vine: add version to bindings * wq: add version to bindings * fix definition | 29 April 2024, 14:41:40 UTC |
55566d1 | Benjamin Tovar | 29 April 2024, 14:40:48 UTC | Worker id after random init (#3792) * use better source of low entropy as fallback for srand * vine: srand from random_init * wq: srand from random_init | 29 April 2024, 14:40:48 UTC |
14e0bef | Benjamin Tovar | 28 April 2024, 01:12:02 UTC | vine: small task fixes related to mem leaks (#3790) * vine: cache path mem leak in python bindings * vine: delete lib tasks on manager exit * vine: manager deletes lib tasks, not python bindings * vine: undeclare output file if a buffer in fun-calls | 28 April 2024, 01:12:02 UTC |
872de31 | Douglas Thain | 26 April 2024, 19:58:53 UTC | Vine: Add counters for performance and leak detection. (#3786) * Add internal counters for create/clone/delete of each object type. Added API to display on print/debug as needed. Always send to debug at completion of manager. * format * Added missing files. * Update vine_counters.c * Portable initialization requires nested braces. * Initializers again. * format * Always count a delete, even when removing a refcount * Set last task to None to remove dangling reference. * repetetive but keeps output on a single line * created/cloned/deleted to avoid conflict with C++ keyword | 26 April 2024, 19:58:53 UTC |
6423918 | Benjamin Tovar | 24 April 2024, 17:41:00 UTC | vine: fix small mem leak when adding task context (#3782) | 24 April 2024, 17:41:00 UTC |
e5a829f | Benjamin Tovar | 24 April 2024, 17:27:18 UTC | vine: add the extra input files to lib tasks (#3781) | 24 April 2024, 17:27:18 UTC |
51d90f0 | Benjamin Tovar | 24 April 2024, 00:43:22 UTC | vine: lib tasks coffea casa minor fixes (#3779) * vine: use cores as fallback for slots in dask executor * vine: set environment variables for libraries in dask executor * fix env var call | 24 April 2024, 00:43:22 UTC |
597c561 | Benjamin Tovar | 24 April 2024, 00:29:46 UTC | vine: use worker start time in usecs as rand seed (#3778) * vine: use worker start time in usecs as rand seed * wq: use worker start time in usecs as rand seed | 24 April 2024, 00:29:46 UTC |
d76a900 | Chris Boumalhab | 22 April 2024, 11:56:16 UTC | Vine: Recover temp files on worker removal (#3744) * minor fixes + code review * set up recovery post worker removal * remove worker from file_worker_table * final setup for monday * removed break * linting * some PR fixes * added vine tune functionality * typo * added vine_tune pt2 * linting * code refactor * linting * added valuable comment * linting * defined in header file * assigning initial value * pesky bug * changes to PR comments * lint * new setup in wait_internal * lint * temp files added to hashtable in cache update * removed unecessary if condition * removed commented break * fixes * added debug * PR comment fixes * added upper limit to replication * fixed debug stmnts * small fix --------- Co-authored-by: Chris Boumalhab <cboumalh@cclws17.cse.nd.edu> Co-authored-by: Chris Boumalhab <cboumalh@crcfe01.crc.nd.edu> | 22 April 2024, 11:56:16 UTC |
fd8cf41 | Benjamin Tovar | 17 April 2024, 20:40:40 UTC | rmonitor: update summary file as executable runs with --update-summary (#3754) * rmon: move hostname after arg processing * rmon: avg in collate * rmon: snapshots directly to summary * rmon: --update-summary which writes summary file every interval * update manual * format | 17 April 2024, 20:40:40 UTC |
a933943 | Benjamin Tovar | 17 April 2024, 20:40:25 UTC | vine, wq: set a maximum number of task to run per category (#3759) * wq: move category task counts to change_task_state * wq: adds q.specify_category_max_concurrent("category", max) * vine: move category task counts to change_task_state * vine: adds m.specify_category_max_concurrent("category", max) | 17 April 2024, 20:40:25 UTC |
e5ed488 | Benjamin Tovar | 17 April 2024, 19:27:16 UTC | vine: Worker contact address (#3767) * vine worker: add --contact-address to set arbitrary peer server * vine: fix bug, function needs peer struct, not source name * vine: keep explicit transfer address * vine: rewrite f->source as needed * rename to --transfer-address for consistency * fix bugs: w->transfer_port_active not set and not checked * increase protocol version * vine: worker:// to workerip:// to actually to the rewrite... * adds address_is_valid_ip * addr to hostport, etc. | 17 April 2024, 19:27:16 UTC |
92b2c60 | Benjamin Tovar | 16 April 2024, 20:35:21 UTC | vine: adds vine_update_catalog to allow apps to force catalog update (#3765) | 16 April 2024, 20:35:21 UTC |
32cfa32 | Colin Thomas | 16 April 2024, 18:58:55 UTC | fixed_location flag check (#3766) | 16 April 2024, 18:58:55 UTC |