A repository with the source code for the paper: https://doi.org/10.1093/comnet/cnaf036
- Authors: Michał Czuba(¶†), Mingshan Jia(†), Piotr Bródka(¶†), Katarzyna Musial(†)
- Affiliation:
(¶) WUST, Wrocław, Lower Silesia, Poland
(†) UTS, Sydney, NSW, Australia
First, initialise the environment:
conda env create -f env/conda.yaml
conda activate infmax-mds-ltm-mln
python -m ipykernel install --user --name=infmax-mds-ltm-mln
To use the scripts that produce the analysis, install the source code:
pip install -e .
The dataset is stored with DVC. To obtain it, you need access to a Google Drive. Please
send a request via e-mail (michal.czuba@pwr.edu.pl) to have it granted. Then, execute in shell:
dvc pull
The zip files are large, and we recommend pulling them only if necessary.
.
├── README.md
├── data
│ ├── brute_ds -> a brute force DS finder
│ ├── networks -> networks used in experiments
│ ├── processed_results
│ ├── processed_results_2nd
│ ├── raw_results
│ ├── raw_results_2nd
│ └── test -> exemplary results of the simulator used in the E2E test
├── env -> a definition of the runtime environment
├── scripts
│ ├── analysis
│ └── configs -> exemplary configuration files
├── src -> scripts to execute experiments and process the results
├── run_experiments.py -> an entrypoint to trigger the pipeline to evaluate MDS in InfMax
├── test_reproducibility.py -> E2E test to prove that results can be repeated
First stage:
- batch_1 real-world, g-mds, AND
- batch_2 real-world, g-mds, OR
- batch_3 artificial, g-mds, AND
- batch_4 artificial, g-mds, OR
- batch_5 timik1q2009, g-mds, AND
- batch_6 timik1q2009, g-mds, OR
- batch_7 real-world, li-mds, AND
- batch_8 real-world, li-mds, OR
- batch_9 artificial, li-mds, AND
- batch_10 artificial, li-mds, OR
- batch_11 timik1q2009, li-mds, AND
- batch_12 timik1q2009, li-mds, OR
- batch_13 arxiv_netscience_coauthorship, g-mds, AND
- batch_14 arxiv_netscience_coauthorship, g-mds, OR
- batch_15 arxiv_netscience_coauthorship, li-mds, AND
- batch_16 arxiv_netscience_coauthorship, li-mds, OR
Second stage:
- var_actors, series_1: |A|=250
- var_actors, series_2: |A|=250
- var_actors, series_3: |A|=250
- var_actors, series_4: |A|=250
- var_actors, series_5: |A|=250
- var_hubs, series_1: m0=2
- var_hubs, series_2: m0=4
- var_hubs, series_3: m0=6
- var_hubs, series_4: m0=8
- var_hubs, series_5: m0=10
To run experiments, execute: python run_experiments.py <config file>. See example_config.yaml for
inspiration. As a result, for each repetition of the Cartesian product computed for the provided
parameters, a CSV file will be produced with the following columns:
{
seed_ids: str # IDs of actors that were seeds aggr. into string (sep. by ;)
gain: float # gain* obtained using this seed set
simulation_length: int # nb. of simulation steps
seed_nb: int # nb. of actors that were seeds
exposed_nb: int # nb. of active actors at the end of the simulation
unexposed_nb: int # nb. of actors that remained inactive
expositons_rec: str # record of new activations in each epoch aggr. into string (sep. by ;)
network: str # network's name
protocol: str # protocol's name
seed_budget: float # value of the maximal seed budget
mi_value: float # value of the threshold
ss_method: str # seed selection method's name
}
* Gain is the percentage of the population that was not initially seeded and became exposed during
the simulation: (exposed_nb - seed_nb) / (total_actor_nb - seed_nb) * 100%
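The gain formula can be checked against a single row of a result CSV. The sketch below is a minimal illustration, not code from this repository: the helpers `compute_gain` and `parse_ids` are hypothetical, the row values are invented, and it assumes that `total_actor_nb` equals `exposed_nb + unexposed_nb`, which the column descriptions suggest but do not state.

```python
def compute_gain(exposed_nb: int, seed_nb: int, total_actor_nb: int) -> float:
    """Percentage of the non-seeded population that became exposed."""
    non_seeded = total_actor_nb - seed_nb
    if non_seeded <= 0:
        raise ValueError("seed set must be smaller than the actor population")
    return (exposed_nb - seed_nb) / non_seeded * 100


def parse_ids(aggregated: str) -> list[str]:
    """Split a ';'-separated field such as seed_ids into a list."""
    return [item for item in aggregated.split(";") if item]


# Hypothetical row values, for illustration only.
row = {"seed_ids": "a1;a7;a9", "seed_nb": 3, "exposed_nb": 53, "unexposed_nb": 50}
total_actor_nb = row["exposed_nb"] + row["unexposed_nb"]  # assumption, see above
gain = compute_gain(row["exposed_nb"], row["seed_nb"], total_actor_nb)
print(parse_ids(row["seed_ids"]), gain)
```

With these made-up numbers, 50 of the 100 actors outside the seed set become exposed, so the gain is 50%.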
The simulator will also save the provided configuration file, the rankings of actors used in
computations, and detailed logs of evaluated cases whose index modulo full_output_frequency equals 0.
Results are supposed to be fully reproducible. There is a test for that: test_reproducibility.py.
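The way cases are selected for detailed logging can be pictured as follows. This is a minimal sketch under assumptions: the parameter values are invented, the enumeration order used by the simulator may differ, and `cases_with_full_output` is a hypothetical helper rather than a function from src.

```python
from itertools import product


def cases_with_full_output(parameter_space: dict, full_output_frequency: int):
    """Yield (index, case) for cases whose index is divisible by the frequency."""
    names = list(parameter_space)
    grid = product(*(parameter_space[name] for name in names))
    for index, values in enumerate(grid):
        if index % full_output_frequency == 0:
            yield index, dict(zip(names, values))


# Invented parameter space; names mirror the CSV columns but values are made up.
space = {
    "protocol": ["AND", "OR"],
    "mi_value": [0.1, 0.2, 0.3],
    "seed_budget": [5, 10],
}
selected = list(cases_with_full_output(space, full_output_frequency=5))
for index, case in selected:
    print(index, case)
```

Here the Cartesian product has 12 cases, and with a frequency of 5 only the cases at indices 0, 5, and 10 would receive detailed logs.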
To process raw results, please execute the scripts in the scripts/analysis directory in the order
depicted in the following tree. Please note that the names of the scripts reflect the names of the
generated files under data/processed_results:
.
├── distr_expos.ipynb
├── quantitative_comparison.py
│ ├── effectiveness_heatmaps.py
│ └── profile_reports.py
├── metrics.py
├── similarities_mds.py
├── similarities_seeds.py
├── visualisations_mds.py
├── mds_algos_comparison.py
└── quantitative_comparison_2nd.py
This work was supported by the National Science Centre, Poland [grant no. 2022/45/B/ST6/04145] (www.multispread.pwr.edu.pl); the Polish Ministry of Science and Higher Education programme “International Projects Co-Funded”; and the EU under the Horizon Europe [grant no. 101086321]. Views and opinions expressed are those of the authors and do not necessarily reflect those of the funding agencies.