-- ARTIFACT LINK:

Public URL: https://arsr.inesc-id.pt/~pmorvalho/UpMax.zip

-- ADDITIONAL REQUIREMENTS:

Install Python dependencies and unzip all the data and solvers:

bash scripts/config.sh

This script installs PySAT and matplotlib and their dependencies. No internet connection is required.
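
To verify that the installation succeeded, the following minimal sanity check can be used (assuming the packages install under their usual module names, pysat and matplotlib):

python3 -c "import pysat; import matplotlib; print('dependencies OK')"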

-- EXPERIMENT RUNTIME:

Our dataset contains 2000 MaxSAT formulae: 1000 for the minimum sum coloring problem and 1000 for the seating assignment problem.
In our experiments, we evaluate 6 different partition schemes plus the standard (non-partitioned) WCNF format, and 7 different MaxSAT algorithms. Two of these algorithms (MaxHS and UWrMaxSAT) only accept WCNF.

So for each problem, we have 1000 instances.

The number of runs:
- Each instance can be partitioned in 6 different ways (6000 runs) or not partitioned at all (1000 runs).
- 5 algorithms accept both partitioned and non-partitioned input: 5 x (6000 + 1000) = 35000 runs.
- 2 algorithms only accept WCNF: 2 x 1000 = 2000 runs.

So for each problem, we have 37000 runs (35000+2000).

Since we evaluate two MaxSAT problems (2 x 37000), our experiments correspond to 74,000 (74k) different runs.
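
As a sanity check, this count can be reproduced with plain shell arithmetic:

echo $(( 5 * (6000 + 1000) + 2 * 1000 ))   # 37000 runs per problem
echo $(( 2 * 37000 ))                      # 74000 runs in total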

We used StarExec [40] with a timeout of 1800 seconds and a memory limit of 32 GB for each instance. Our experiments with a timeout of 1800 s (30 min) took ~52998314 seconds of CPU time, i.e., around 613 days if run sequentially. However, we ran them on multiple machines in a cluster.

We propose to evaluate our artifact on a representative subset of our dataset that contains 1% of our instances (10 for each problem, 740 runs). Furthermore, we propose to use a timeout of 180 s (3 min).


-- REPRODUCIBILITY INSTRUCTIONS:

The artifact is in the directory UpMax.
All the following commands must be run from the UpMax directory, which can be accessed using the command:

cd UpMax

******* OPTION #1 ******* 

- To run one instance:
(This should take around 1 minute to process)

bash run_all.sh representative_subset_1_instance 180
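
To quickly confirm that the run produced output, the generated logs can be listed (they are stored in 'results/logs/', as described below):

ls results/logs/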

******* OPTION #2 ******* 

- To run the representative subset of our dataset, 1% (10 instances of each MaxSAT problem): (OUR RECOMMENDATION)
(This may take around 12-13 hours to process using the VM; StarExec took us around 10 hours of CPU time.)

bash run_all.sh representative_subset_10_instances 180

******* OPTION #3 ******* 

- To run the representative subset of our dataset, 10% (100 instances of each MaxSAT problem):
(This may take around 8 days to process using the VM; StarExec took us around 7 days of CPU time.)

bash run_all.sh representative_subset_100_instances 180

******* OPTION #4 ******* 

- To run all benchmarks and reproduce the results section of the paper, use the script run_all.sh (this takes around 613 days if run sequentially, i.e., not on a cluster such as StarExec):

unzip evaluation_dataset
./run_all.sh evaluation_dataset 1800

************************** 

This command will run all instances with a timeout of 1800 seconds (30 min) and store the resulting logs in "results/logs/".

However, as explained above, this artifact should be evaluated with a timeout of 180 s on our representative subset, unless the reviewers have access to StarExec or another cluster.


The run_all.sh script calls the following scripts (an example invocation sequence is given after this list):

- ./scripts/run_solvers.sh DATA_SET TIMEOUT_S

  This script runs all solvers on the dataset passed as $1 with a timeout of $2 seconds.
  The logs generated by this script are stored in 'results/logs/'
  
- ./scripts/read_logs.sh TIMEOUT_S

  This script processes all the logs and saves a CSV file for each algorithm-partition scheme pair in 'results/csvs/'.

- ./scripts/get_tables.sh TIMEOUT_S

  This script prints and saves the tables used in our paper, i.e., the number of solved instances for each Algorithm x Partition_Scheme pair. The tables are stored in 'results/tables/'.

- ./scripts/get_plots.sh

  This script generates the CSV files for the plots used in our paper using the mkplot tool. The CSV files will be stored in 'results/plots/'.

- ./scripts/gen_plots.sh TIMEOUT_S

  This script generates the plots used in our paper. The plots will be saved in 'results/plots/'.
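
For example, after a run of the recommended representative subset with a 180 s timeout, the pipeline corresponds to the following step-by-step sequence (mirroring, as far as we can tell from the scripts above, the arguments run_all.sh passes in that configuration):

bash scripts/run_solvers.sh representative_subset_10_instances 180
bash scripts/read_logs.sh 180
bash scripts/get_tables.sh 180
bash scripts/get_plots.sh
bash scripts/gen_plots.sh 180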


--------------------      Table 1 (Section 4.1) and Table 2 (Section 4.2)      --------------------


The script run_all.sh will generate Table 1 and Table 2.

The CSV files for each table are stored in 'results/tables/'.

Table 1: 'results/tables/minimum_sum_coloring.csv'

Table 2: 'results/tables/seating_assignment.csv'

These tables can also be generated by running the following command:

./scripts/get_tables.sh TIMEOUT_S
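
For the recommended 180 s timeout used with the representative subsets above, the concrete invocation would be:

./scripts/get_tables.sh 180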


-----------------   PLOTS (Section 4)   -----------------

The script run_all.sh will generate the CSV files for the plots.
More specifically, it calls the following scripts:

./scripts/get_plots.sh
./scripts/gen_plots.sh 180
Cactus plots:
 - Fig. 6 - MSC Problem
     CSV file:   'results/plots/minimum-sum-coloring/msc-cactus.csv'
     plot (png): 'results/plots/minimum-sum-coloring/msc-cactus.png'

 - Fig. 7 - Seating Assignment Problem
     CSV file:   'results/plots/seating-assignment/seating-cactus.csv'
     plot (png): 'results/plots/seating-assignment/seating-cactus.png'

Scatter plots:
 - Fig. 8 (a) - MSC - OLL - RES vs. No Part.
     CSV file:   'results/plots/minimum-sum-coloring/msc-scatter-OLL.csv'
     plot (png): 'results/plots/minimum-sum-coloring/msc-scatter-OLL.png'

 - Fig. 8 (b) - MSC - WBO - RES vs. No Part.
     CSV file:   'results/plots/minimum-sum-coloring/msc-scatter-WBO.csv'
     plot (png): 'results/plots/minimum-sum-coloring/msc-scatter-WBO.png'

 - Fig. 8 (c) - SA - MSU3 - Table vs. No Part.
     CSV file:   'results/plots/seating-assignment/seating-scatter-MSU3.csv'
     plot (png): 'results/plots/seating-assignment/seating-scatter-MSU3.png'

 - Fig. 8 (d) - SA - WBO - Tag vs. No Part.
     CSV file:   'results/plots/seating-assignment/seating-scatter-WBO.csv'
     plot (png): 'results/plots/seating-assignment/seating-scatter-WBO.png'


------------------------     ORIGINAL LOGS     ------------------------

The artifact also contains, in the zip tacas23-logs.zip, the logs from the runs whose results are shown in the paper. Furthermore, 'tacas23-results-database.db' is an SQLite database built from our logs (note that SQLite is not currently available in the installation package). Since it is unlikely that the reviewers will be able to reproduce all the results (due to the large number of CPU hours required), making all the original logs available should clarify any questions the reviewers may have.
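
If the sqlite3 command-line tool happens to be available on the host machine (it is not part of our installation package), the database can be inspected with the standard sqlite3 dot-commands, e.g.:

sqlite3 tacas23-results-database.db ".tables"
sqlite3 tacas23-results-database.db ".schema"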

These results were obtained using StarExec [40] with a timeout of 1800 seconds and a memory limit of 32 GB for each instance. Since running the solvers in the VM is slower than on StarExec, and the memory limit of the VM is only 8 GB, the results obtained in the VM may differ slightly from those obtained on StarExec. However, the same trend should be observed in the VM.

Our paper's plots and respective CSV files are in "our-results/".

To generate the plots:

cd our-results/
./gen_plots.sh

Note that matplotlib must be installed first, as explained in the ADDITIONAL REQUIREMENTS section.

Furthermore, the artifact also contains, in the zip tacas23-artifact-eval-logs.zip, the logs from a run using only the representative subset (representative_subset_10_instances). These results were obtained using an Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz with a timeout of 180 s and a memory limit of 8 GB.

The source code for UpMax is also publicly available at https://github.com/forge-lab/upmax.
This repository should not be used to reproduce the artifact; however, it contains the current version of the tool and will continue to be developed and shared with the research community.