3ad04be | John Vivian | 02 February 2017, 01:31:18 UTC | Refactor SRA pipeline to use faster method than fastq-dump. Pull SRA data from FTP and convert locally. Run cutadapt directly, skipping unnecessary pre-processing step in rna-seq pipeline. | 02 February 2017, 01:31:18 UTC |
7843cb5 | John Vivian | 20 January 2017, 09:40:19 UTC | Add additional fastq-dump parameters | 20 January 2017, 09:40:19 UTC |
cd867db | John Vivian | 02 January 2017, 00:01:43 UTC | Add SRA manifest | 02 January 2017, 00:01:43 UTC |
4f2f094 | John Vivian | 02 January 2017, 00:01:21 UTC | Add gz flag to config attributes | 02 January 2017, 00:01:21 UTC |
5a3ea3d | John Vivian | 02 January 2017, 00:01:06 UTC | Fix path for globbed fastqs | 02 January 2017, 00:01:06 UTC |
cf154da | John Vivian | 02 January 2017, 00:00:42 UTC | Make output dir for failed samples: "failed-samples" | 02 January 2017, 00:00:42 UTC |
854cac8 | John Vivian | 02 January 2017, 00:00:08 UTC | Add cores attribute | 02 January 2017, 00:00:08 UTC |
42b429f | John Vivian | 01 January 2017, 23:59:14 UTC | Manifest partitions | 01 January 2017, 23:59:14 UTC |
69fec21 | John Vivian | 01 January 2017, 04:45:34 UTC | Example config | 01 January 2017, 04:45:34 UTC |
27204ff | John Vivian | 01 January 2017, 04:45:20 UTC | Initial pipeline commit | 01 January 2017, 04:45:20 UTC |
a34c861 | John Vivian | 01 January 2017, 03:15:45 UTC | Initial commit for SRA-CGL-RNASeq pipeline | 01 January 2017, 03:15:45 UTC |
2ea61ea | John Vivian | 20 December 2016, 20:33:55 UTC | Add encryption to upload | 20 December 2016, 20:33:55 UTC |
a9997da | John Vivian | 20 December 2016, 19:13:48 UTC | Short script for packaging / transferring beatAML data | 20 December 2016, 19:13:48 UTC |
30c440f | John Vivian | 13 December 2016, 13:39:34 UTC | serial gzip of fastqs | 13 December 2016, 13:39:34 UTC |
955c281 | John Vivian | 13 December 2016, 11:20:43 UTC | Replaced start due to module loading issue. Confirm still an issue to issue 1000 jobs from one child? 20000? | 13 December 2016, 11:20:43 UTC |
4877c9e | John Vivian | 13 December 2016, 08:34:26 UTC | Handy collection of files for creating test inputs | 13 December 2016, 08:34:26 UTC |
c2ef039 | John Vivian | 13 December 2016, 08:34:02 UTC | Committing to get out of my git history | 13 December 2016, 08:34:02 UTC |
2dcd1f5 | John Vivian | 13 December 2016, 08:32:32 UTC | Correct comment typo | 13 December 2016, 08:32:32 UTC |
5db70a9 | John Vivian | 13 December 2016, 08:32:21 UTC | Cython example code | 13 December 2016, 08:32:21 UTC |
8284631 | John Vivian | 13 December 2016, 08:31:49 UTC | Finish process and upload step PEP | 13 December 2016, 08:31:49 UTC |
131c513 | John Vivian | 13 December 2016, 07:53:22 UTC | initial commit | 13 December 2016, 07:53:22 UTC |
14fb2d2 | John Vivian | 25 May 2016, 22:47:25 UTC | Fix bad split, skip existing files. | 25 May 2016, 22:47:25 UTC |
aabf387 | John Vivian | 25 May 2016, 21:50:52 UTC | Clarified key path | 25 May 2016, 21:50:52 UTC |
4f22cc0 | John Vivian | 25 May 2016, 21:50:37 UTC | Python hello world | 25 May 2016, 21:50:37 UTC |
338e8db | John Vivian | 25 May 2016, 21:50:19 UTC | For re-encrypting data using per-file keys derived from a master | 25 May 2016, 21:50:19 UTC |
fd51b68 | John Vivian | 25 May 2016, 21:48:49 UTC | Generate signed URL for SSEC downloads | 25 May 2016, 21:48:49 UTC |
4522d02 | John Vivian | 25 May 2016, 21:48:13 UTC | Convert back to boto2 | 25 May 2016, 21:48:13 UTC |
447195b | John Vivian | 25 May 2016, 21:47:41 UTC | Initial idea for jenkins.py for toil-scripts | 25 May 2016, 21:47:41 UTC |
3951b22 | John Vivian | 25 May 2016, 21:47:00 UTC | Script for uploading to Ceph If boto credentials not setup appropriately | 25 May 2016, 21:47:00 UTC |
cb3a4cb | John Vivian | 25 May 2016, 21:46:24 UTC | Defuckifing the results of your paper before submitting it is a good idea! | 25 May 2016, 21:46:24 UTC |
306a9ec | John Vivian | 23 March 2016, 01:34:26 UTC | Delete SDB artifacts | 23 March 2016, 01:34:26 UTC |
8cc97c2 | John Vivian | 23 March 2016, 00:55:05 UTC | Functionalized id retrieval instead of slicing | 23 March 2016, 00:55:05 UTC |
5d0a92f | John Vivian | 23 March 2016, 00:54:40 UTC | Made start_time and end_time optional | 23 March 2016, 00:54:40 UTC |
ca412e6 | John Vivian | 23 March 2016, 00:54:26 UTC | Compacted get_instance_ids | 23 March 2016, 00:54:26 UTC |
b6b1fe8 | John Vivian | 23 March 2016, 00:53:54 UTC | gitignore for pyc | 23 March 2016, 00:53:54 UTC |
3918676 | John Vivian | 22 March 2016, 23:50:38 UTC | fixed ridiculous os.path.join bug | 22 March 2016, 23:50:38 UTC |
7a2c68e | John Vivian | 22 March 2016, 20:54:15 UTC | bug fixes | 22 March 2016, 20:54:15 UTC |
ab98362 | John Vivian | 22 March 2016, 20:43:47 UTC | My version of the upload directory to s3 script | 22 March 2016, 20:43:47 UTC |
478a570 | John Vivian | 02 March 2016, 07:01:51 UTC | Modified documentation | 02 March 2016, 07:01:51 UTC |
56c1ce0 | John Vivian | 28 February 2016, 17:22:56 UTC | Fixed invocation of pipeline for restart, wiggle, and save_bams | 28 February 2016, 17:22:56 UTC |
b2e97b3 | John Vivian | 25 February 2016, 23:00:21 UTC | Renamed as no longer for scaling tests. Script with sub parsers for: creating config for scaling tests (`create-config`); launching a cluster with cgcloud (`launch-cluster`); launching a pipeline (`launch-pipeline`); real-time metric collection (`launch-metrics`). | 25 February 2016, 23:00:21 UTC |
5d8ad00 | John Vivian | 25 February 2016, 22:57:29 UTC | Script to generate metric plots and estimate cost. Given a directory of metrics produced from `launch-metrics` of the automated_scaling_tests script, produce a plot of metrics and estimate costs. | 25 February 2016, 22:57:29 UTC |
08fce62 | John Vivian | 25 February 2016, 22:51:42 UTC | Added `--share` to create-config options | 25 February 2016, 22:51:42 UTC |
b577f7b | John Vivian | 23 February 2016, 20:08:42 UTC | Stupid typo | 23 February 2016, 20:08:42 UTC |
5581b70 | John Vivian | 23 February 2016, 20:07:35 UTC | Made saving wiggle and bams optional via cmd line arguments | 23 February 2016, 20:07:35 UTC |
f792666 | John Vivian | 21 February 2016, 17:16:17 UTC | Added "Zone" as option to launch-cluster. CGCloud now requires a `--zone`. | 21 February 2016, 17:16:17 UTC |
24cc3b3 | John Vivian | 17 February 2016, 23:41:38 UTC | Merge pull request #6 from arkal/master: No longer have instances of s3am running in parallel | 17 February 2016, 23:41:38 UTC |
ed276d1 | arkal | 17 February 2016, 22:36:13 UTC | No longer have instances of s3am running in parallel. | 17 February 2016, 22:36:13 UTC |
d0f7a4b | John Vivian | 14 February 2016, 19:18:41 UTC | Metric collection and instance termination now more aggressive | 14 February 2016, 19:18:41 UTC |
a8d60de | John Vivian | 11 February 2016, 22:53:36 UTC | Peppy | 11 February 2016, 22:53:36 UTC |
91b89ac | John Vivian | 11 February 2016, 22:52:23 UTC | Modularized "Uber script". Added timestamps to logging. Added sub parsers to each "part" of the program. Pipeline now launched via a screen. | 11 February 2016, 22:52:23 UTC |
a816bf5 | John Vivian | 10 February 2016, 02:48:42 UTC | Merge pull request #5 from arkal/master: encrypt_files_in_dir_to_s3.py now universally uses --sse-key-base64 | 10 February 2016, 02:48:42 UTC |
5e5c807 | John Vivian | 10 February 2016, 02:46:08 UTC | Added timestamp to logging. Fixed critical error where a comprehension was run before checking if it contained anything. Don't worry, definitely didn't fail right at the start of the large recompute project our lab spent a month preparing for. Removed call to modify launch script since it was deprecated. | 10 February 2016, 02:46:08 UTC |
6d8007a | John Vivian | 09 February 2016, 00:45:51 UTC | YAMR: Yet Another Massive Refactor. Certain variables pulled out as top-level vars for development. Removed launch script editing; the pipeline is now called directly. Pipeline is now run a second time with `--restart` if it exits with a non-zero status code. Removed alarms; instances are now terminated directly via boto. Datapoints are stored as named tuples and raw dumped to a file with no processing. | 09 February 2016, 00:45:51 UTC |
e3361ab | John Vivian | 03 February 2016, 17:19:34 UTC | Used function 'id' in place of str 'instance_id' | 03 February 2016, 17:19:34 UTC |
b0b21d8 | John Vivian | 03 February 2016, 09:45:09 UTC | Complete refactor. Metrics now collected and raw dumped to a file in real time (1 hour intervals). Workers are now killed during metric collection when idle. Removed alarm application in favor of `boto.terminate_instances()`. Replaced cost calculations with "Max" costs to simulate if entire cluster were running. Will have to develop a new method to analyze costs given an average hourly cost and the generated metrics array that represents time. | 03 February 2016, 09:45:09 UTC |
75ff824 | John Vivian | 03 February 2016, 05:58:42 UTC | ensure state is running | 03 February 2016, 05:58:42 UTC |
35858e9 | John Vivian | 01 February 2016, 20:09:30 UTC | Added try/except block in case of failure collecting cost values. | 01 February 2016, 20:09:30 UTC |
a215da6 | John Vivian | 01 February 2016, 17:23:16 UTC | Simplified output. Added try/except block for pipeline launch. | 01 February 2016, 17:23:16 UTC |
3283aaa | John Vivian | 31 January 2016, 00:12:56 UTC | Added backoff for metric collection | 31 January 2016, 00:12:56 UTC |
9eb0676 | John Vivian | 31 January 2016, 00:11:55 UTC | Improved blocking. Added backoff for alarm application. Made run_report more robust. Fixed type error. Output log.txt on leader. | 31 January 2016, 00:11:55 UTC |
c4572ac | John Vivian | 28 January 2016, 08:28:16 UTC | Added TypeError in block_workers() for bizarre boto auth failure. Collect metrics before killing workers in case metric collection takes too long. | 28 January 2016, 08:28:16 UTC |
8802f9b | John Vivian | 27 January 2016, 06:16:39 UTC | Removed all plotting / pruning. Added 'Paging' for long running instances. collect_metrics accepts time.time() floats for start and stop. Fixed collection period to be 5 min. | 27 January 2016, 06:16:39 UTC |
b20dc35 | John Vivian | 27 January 2016, 05:51:40 UTC | Refactored to handle metric collection refactor | 27 January 2016, 05:51:40 UTC |
f19ee2b | John Vivian | 25 January 2016, 19:38:01 UTC | Added precautions to avoid preemptive shutdown | 25 January 2016, 19:38:01 UTC |
d07c7de | John Vivian | 23 January 2016, 18:38:02 UTC | Added date to folder where run_report.txt is written. | 23 January 2016, 18:38:02 UTC |
0e2109c | John Vivian | 22 January 2016, 22:15:23 UTC | 50 char limit on s3 buckets (and no underscores). | 22 January 2016, 22:15:23 UTC |
5b6a62d | John Vivian | 22 January 2016, 17:56:21 UTC | cleanup | 22 January 2016, 17:56:21 UTC |
57c2603 | John Vivian | 22 January 2016, 17:20:40 UTC | Added date to s3_dir, add try/except block on blocking function in case instance goes down. | 22 January 2016, 17:20:40 UTC |
c344aa4 | John Vivian | 22 January 2016, 17:19:45 UTC | Added docstring | 22 January 2016, 17:19:45 UTC |
c84f498 | John Vivian | 22 January 2016, 17:03:11 UTC | Added generalized function for applying alarms to an instance | 22 January 2016, 17:03:11 UTC |
cfbf988 | John Vivian | 22 January 2016, 03:31:47 UTC | Moved buffer time back to 15 minutes. | 22 January 2016, 03:31:47 UTC |
f61a60c | John Vivian | 22 January 2016, 03:30:13 UTC | Add uuid for consistency with automation pipeline | 22 January 2016, 03:30:13 UTC |
665a57c | John Vivian | 21 January 2016, 18:21:08 UTC | Kill leader, standard UUID, log output. | 21 January 2016, 18:21:08 UTC |
6943441 | John Vivian | 21 January 2016, 18:20:18 UTC | structure change for automated scaling tests | 21 January 2016, 18:20:18 UTC |
df9c438 | John Vivian | 21 January 2016, 18:19:41 UTC | Added avail zone to boto_lib | 21 January 2016, 18:19:41 UTC |
15ed4b0 | John Vivian | 19 January 2016, 17:08:21 UTC | Changed alarm and termination mechanism. Instances are periodically checked for low CPU usage. Once all instances are at <1 CPU for 15 minutes, apply "insta-kill" alarm that terminates all workers. | 19 January 2016, 17:08:21 UTC |
98900d0 | John Vivian | 19 January 2016, 15:41:14 UTC | Removed parallelization (AWS doesn't support it). Added try/except block for instances that don't return metrics. These are subsequently removed from the instances pool. | 19 January 2016, 15:41:14 UTC |
4748f11 | John Vivian | 18 January 2016, 07:30:10 UTC | Automated pipeline for scaling tests for Toil recompute | 18 January 2016, 07:30:10 UTC |
ae5acf8 | John Vivian | 18 January 2016, 07:29:37 UTC | Made primary plotting function generic to any metric | 18 January 2016, 07:29:37 UTC |
f57b31c | John Vivian | 11 January 2016, 22:23:06 UTC | Improved plots for aggregate metric data | 11 January 2016, 22:23:06 UTC |
577b289 | John Vivian | 11 January 2016, 22:22:36 UTC | library of boto functions | 11 January 2016, 22:22:36 UTC |
1592519 | John Vivian | 07 January 2016, 04:52:02 UTC | vertical plot of cpu, disk, and networking | 07 January 2016, 04:52:02 UTC |
1e7ba3b | John Vivian | 30 December 2015, 21:03:19 UTC | Code refactor for more accurate avg pricing and total pricing. | 30 December 2015, 21:03:19 UTC |
563c9bd | John Vivian | 24 December 2015, 06:42:59 UTC | Now returns an answer if instance is actively running | 24 December 2015, 06:42:59 UTC |
f4e6f00 | John Vivian | 15 December 2015, 22:36:53 UTC | Help menu improvements | 15 December 2015, 22:36:53 UTC |
dfe3ec5 | John Vivian | 15 December 2015, 20:27:33 UTC | args fix | 15 December 2015, 20:27:33 UTC |
0de4c2e | arkal | 08 December 2015, 22:21:24 UTC | Master key is now an optional argument. Running without master key will transfer to S3 BUCKET without encryption. | 08 December 2015, 22:21:24 UTC |
4f49bc5 | John Vivian | 08 December 2015, 18:29:29 UTC | Minor adjustments | 08 December 2015, 18:29:29 UTC |
b951828 | John Vivian | 08 December 2015, 18:28:22 UTC | pack static values, fix spacing in main() docstring | 08 December 2015, 18:28:22 UTC |
58a6d3b | John Vivian | 08 December 2015, 18:17:45 UTC | PEP8 compliance | 08 December 2015, 18:17:45 UTC |
bc2c043 | John Vivian | 08 December 2015, 18:04:10 UTC | Actually works now! Hoorah | 08 December 2015, 18:04:10 UTC |
109e904 | John Vivian | 08 December 2015, 08:19:26 UTC | Calculates the ec2 spot instance cost given instanceID and instanceType. Needs availability zone specification. | 08 December 2015, 08:19:26 UTC |
622f148 | arkal | 17 November 2015, 23:07:44 UTC | encrypt_files_in_dir_to_s3.py now universally uses --sse-key-base64 as a s3am argument. | 17 November 2015, 23:07:44 UTC |
44bcdb7 | John Vivian | 17 November 2015, 23:01:32 UTC | Merge pull request #4 from arkal/master: Fixed to use the correct remote s3 url if -R is provided | 17 November 2015, 23:01:32 UTC |
42c5174 | arkal | 17 November 2015, 22:19:58 UTC | Fixed to use the correct remote s3 url if -R is provided | 17 November 2015, 22:19:58 UTC |
c1e7a4f | John Vivian | 17 November 2015, 19:19:20 UTC | Merge pull request #3 from arkal/master: Added clause to handle quotes in the key. | 17 November 2015, 19:19:20 UTC |
f850f95 | arkal | 17 November 2015, 18:16:59 UTC | Added clause to handle quotes in the key. Such keys will be passed as files since they will corrupt the list of strings passed to popen otherwise. | 17 November 2015, 18:20:42 UTC |
5b8cd66 | John Vivian | 06 November 2015, 23:52:52 UTC | Merge pull request #2 from arkal/master: Refactored encrypt_files_in_dir_to_s3.py | 06 November 2015, 23:52:52 UTC |
79e1841 | arkal | 05 November 2015, 23:24:54 UTC | Refactored encrypt_files_in_dir_to_s3.py to have a main function and to parse arguments through argparse. The script now accepts multiple files, a folder of files, and even a folder with subfolders. encrypt_files_in_dir_to_s3.py also now attempts to pass the key itself to s3am instead of writing it out somewhere. However, if the key starts with a - character, the key is written to a temp directory and the key file is passed to s3am. The directory is deleted on exit. Pylinted for PEP8 compliance. | 06 November 2015, 22:46:32 UTC |