Commit 1e06a6d

Merge pull request #22 from vsoch/fix/paper
updating get_userhome functions and putting paper lines, one per line
2 parents 77938ee + 65c1f13 commit 1e06a6d

3 files changed: +52 -95 lines changed

paper/paper.md

Lines changed: 41 additions & 89 deletions
@@ -17,104 +17,68 @@ bibliography: paper.bib
1717

1818
# Summary
1919

20-
[WatchMe](https://vsoch.github.io/watchme/) is a simple tool to allow for reproducibly watching for changes in one or more
21-
web pages, system resources, or any task function that is provided to the library.
22-
It addresses a problem in research that it's highly challenging to create and share
23-
reproducible tasks, meaning:
20+
[WatchMe](https://vsoch.github.io/watchme/) is a simple tool to allow for reproducibly watching for changes in one or more web pages, system resources, or any task function that is provided to the library.
21+
It addresses a problem in research that it's highly challenging to create and share reproducible tasks, meaning:
2422

2523
1. a configuration file (recipe) stores the parameters for tasks including a function to run, a frequency, and any other necessary variables
2624
2. the tasks are automatically run at some frequency
2725
3. the results of the runs are saved automatically via version control
2826
4. the results collected can be re-assembled into temporal data structures that are ready for analysis
2927
5. the entire base (configuration, tasks, and results) can be shared via GitHub, and reproduced by others
3028

31-
With WatchMe, a researcher can easily generate a repository (a watcher) that is configured
32-
to run one or more tasks at a particular frequency, and automatically commit changes to git.
33-
If he or she chooses, the repository can be pushed to a version control service like GitHub,
34-
and the entire configuration and set of tasks is easily reproducible by anyone that uses
35-
the client to get the repository. Each watcher uses git not only for version control of
36-
configuration files, but as a temporal database from which the results of the task runs can
37-
be extracted. Every change to a task within a watcher directory is also recorded via
38-
git, making the entire setup well documented with minimal to no work needed by the
39-
researcher.
29+
With WatchMe, a researcher can easily generate a repository (a watcher) that is configured to run one or more tasks at a particular frequency, and automatically commit changes to git.
30+
If he or she chooses, the repository can be pushed to a version control service like GitHub, and the entire configuration and set of tasks is easily reproducible by anyone that uses the client to get the repository. Each watcher uses git not only for version control of configuration files, but as a temporal database from which the results of the task runs can be extracted.
31+
Every change to a task within a watcher directory is also recorded via git, making the entire setup well documented with minimal to no work needed by the researcher.
4032

4133
## Background
4234

43-
Reproducible monitoring and data collection for an individual researcher is a challenging task. Typically,
44-
if a web page or system resource is to be monitored, the researcher must write custom scripts and
45-
extraction steps, and in the best case scenario, he or she uses version control for the scripts or final
46-
result. While many online services exist to watch for changes in one or more web pages
47-
(e.g., see https://visualping.io/ for an example service), these resources are problematic
48-
for research use. Specifically:
35+
Reproducible monitoring and data collection for an individual researcher is a challenging task.
36+
Typically, if a web page or system resource is to be monitored, the researcher must write custom scripts and extraction steps, and in the best case scenario, he or she uses version control for the scripts or final result.
37+
While many online services exist to watch for changes in one or more web pages (e.g., see https://visualping.io/ for an example service), these resources are problematic for research use. Specifically:
4938

5039
1. It's typically the case that you will be charged for more than a few pages
5140
2. It's not appropriate for a research setting where you would want programmatic parsing
5241
3. The configuration of your watcher is not reproducible.
5342

54-
Thus, WatchMe is ideal for the individual researcher that does not want to (or cannot)
55-
pay for a service, and wants to be able to share their monitoring tasks with other
56-
researchers, such as for a publication or similar. It also allows for collaborative data
57-
collection, as multiple users can run the equivalent task, have data exported named
58-
uniquely, and then submit a pull request to combine the data.
59-
43+
Thus, WatchMe is ideal for the individual researcher that does not want to (or cannot) pay for a service, and wants to be able to share their monitoring tasks with other researchers, such as for a publication or similar.
44+
It also allows for collaborative data collection, as multiple users can run the equivalent task, have data exported named uniquely, and then submit a pull request to combine the data.
6045

6146
## WatchMe Tasks
6247

63-
By default, WatchMe comes with two task types intended to provide general templates
64-
for creating specific monitoring tasks.
48+
By default, WatchMe comes with two task types intended to provide general templates for creating specific monitoring tasks.
6549

6650
### Web Tasks
6751

68-
It's a common need to want to retrieve content from the web, whether that be a request
69-
to get a page, a subset of a page, the download of a file, or a post to an application
70-
programming interface (API). These general tasks perform these operations, with customizations
71-
to control the url, how the response is parsed, headers and parameters, and the result written.
72-
For example, the general set of web tasks can be used to check a set of cities for changes to weather or climate,
73-
to monitor an API endpoint, track changes in prices of item(s) of interest,
74-
download a file at some frequency, or watch a job board for changes. For details about
75-
setup and usage, see the [urls tasks](https://vsoch.github.io/watchme/watchers/urls/)
76-
documentation.
77-
52+
It's a common need to want to retrieve content from the web, whether that be a request to get a page, a subset of a page, the download of a file, or a post to an application programming interface (API).
53+
These general tasks perform these operations, with customizations to control the url, how the response is parsed, headers and parameters, and the result written.
54+
For example, the general set of web tasks can be used to check a set of cities for changes to weather or climate, to monitor an API endpoint, track changes in prices of item(s) of interest, download a file at some frequency, or watch a job board for changes.
55+
For details about setup and usage, see the [urls tasks](https://vsoch.github.io/watchme/watchers/urls/) documentation.
7856

7957
### System Tasks
8058

81-
The psutils library of functions uses the Python Psutil [@psutil] set of functions
82-
to monitor system resources, sensors, and python environment. Given the naming of data outputs
83-
based on the host, if a second user forked the example repository
84-
and ran it on his or her host, he or she could open a pull request to contribute new data.
85-
Given the unique naming of each task file, the data could co-exist with previous
86-
data generated on other hosts. Given the common export formats,
87-
common analyses could be shared and run on the exports by the different users.
88-
See the [psutils tasks](https://vsoch.github.io/watchme/watchers/psutils/)
89-
documentation for details, and continue reading for a specific example.
59+
The psutils library of functions uses the Python Psutil [@psutil] set of functions to monitor system resources, sensors, and python environment. Given the naming of data outputs based on the host, if a second user forked the example repository and ran it on his or her host, he or she could open a pull request to contribute new data.
60+
Given the unique naming of each task file, the data could co-exist with previous data generated on other hosts.
61+
Given the common export formats, common analyses could be shared and run on the exports by the different users.
62+
See the [psutils tasks](https://vsoch.github.io/watchme/watchers/psutils/) documentation for details, and continue reading for a specific example.
9063
An example that uses the set of system tasks is discussed next.
9164

9265

9366
## Research Usage
9467

95-
The command line usage of watchme, along with making the tool programmatic,
96-
also makes it ideal for usage on research clusters, or custom usage within scripts.
97-
Importantly, WatchMe is able to take a repository of result files produced by one
98-
or more contributors, and export data structures that keep a record of timestamps,
99-
results, and commit ids for each addition of a results file.
68+
The command line usage of watchme, along with making the tool programmatic, also makes it ideal for usage on research clusters, or custom usage within scripts.
69+
Importantly, WatchMe is able to take a repository of result files produced by one or more contributors, and export data structures that keep a record of timestamps, results, and commit ids for each addition of a results file.
10070

10171
### Watcher Example
10272

103-
As an example, the repository [watchme-system](https://github.com/vsoch/watchme-system) runs a
104-
set of hourly tasks to measure the host networking, cpu and memory usage, sensors
105-
(battery, temperature, fans), and other user and python-specific data. After installing
106-
watchme, a second researcher could easily obtain the task by doing:
73+
As an example, the repository [watchme-system](https://github.com/vsoch/watchme-system) runs a set of hourly tasks to measure the host networking, cpu and memory usage, sensors (battery, temperature, fans), and other user and python-specific data.
74+
After installing watchme, a second researcher could easily obtain the task by doing:
10775

10876
```bash
10977
$ watchme get https://www.github.com/vsoch/watchme-system system
11078
```
11179

112-
The command above would clone the repository, check that it was a valid Watcher (indicated
113-
by presence of a configuration file named watchme.cfg) and then download the folder
114-
to a new watcher named "system" (the second argument) in the default Watchme base folder,
115-
located at $HOME/.watchme. The organization of any watcher is intuitive - the top level
116-
folder is the name for the watcher, and the folders inside that begin with "task-"
117-
represent the various task folders:
80+
The command above would clone the repository, check that it was a valid Watcher (indicated by presence of a configuration file named watchme.cfg) and then download the folder to a new watcher named "system" (the second argument) in the default Watchme base folder, located at $HOME/.watchme.
81+
The organization of any watcher is intuitive - the top level folder is the name for the watcher, and the folders inside that begin with "task-" represent the various task folders:
11882

11983
```bash
12084
/.watchme/system$ tree
@@ -145,14 +109,11 @@ represent the various task folders:
145109
└── watchme.cfg
146110
```
147111

148-
Notice that each task folder has a result file, along with a timestamp to indicate
149-
when the watcher was last run. The user can edit the watchme.cfg if desired, or simply
150-
activate and schedule the watcher (optionally disabling a subset of tasks) to run
151-
at some frequency (e.g., hourly) and commit to git. No further work is required
152-
by the researcher other than keeping the host machine turned on. The researcher can push the results
153-
to a GitHub repository (as was done in this case) and at any time, export the results
154-
for a particular result file. In the command below, we use the "watchme" client to export the watcher folder "system"
155-
for a task called "task-memory". We ask the watcher to parse the result content as json:
112+
Notice that each task folder has a result file, along with a timestamp to indicate when the watcher was last run.
113+
The user can edit the watchme.cfg if desired, or simply activate and schedule the watcher (optionally disabling a subset of tasks) to run at some frequency (e.g., hourly) and commit to git.
114+
No further work is required by the researcher other than keeping the host machine turned on.
115+
The researcher can push the results to a GitHub repository (as was done in this case) and at any time, export the results for a particular result file.
116+
In the command below, we use the "watchme" client to export the watcher folder "system" for a task called "task-memory". We ask the watcher to parse the result content as json:
156117

157118

158119
```bash
@@ -204,37 +165,28 @@ $ watchme export system task-memory vanessa-thinkpad-t460s_vanessa.json --json
204165
}
205166
```
206167

207-
While only two commits are shown in the result above, an actual export for
208-
this particular watcher has results for memory metrics collected on the hour.
209-
The researcher could then perform an [analysis](https://github.com/vsoch/watchme-system/blob/master/data/watchme-task-analysis.ipynb)
210-
using the data collected. As an example, here is a plot from such an analysis that
211-
tracks virtual memory usage of this author, recorded every hour,
212-
over two weekend days.
168+
While only two commits are shown in the result above, an actual export for this particular watcher has results for memory metrics collected on the hour.
169+
The researcher could then perform an [analysis](https://github.com/vsoch/watchme-system/blob/master/data/watchme-task-analysis.ipynb) using the data collected.
170+
As an example, here is a plot from such an analysis that tracks virtual memory usage of this author, recorded every hour, over two weekend days.
213171

214172
![img/virtual-memory-used.png](img/virtual-memory-used.png)
215173

216-
Interestingly, we can see a pattern that correlates with the activity of the author
217-
during the day. Virtual memory usage is low from the previous evening (1800 hours)
218-
through the early morning (0600 hours) and then rises sharply when the author starts
219-
to work. It goes down briefly in the early afternoon when the author pauses for a break,
220-
and picks up afterward, stopping when it's time for dinner. We see that the
221-
system's core temperature follows a similar trend:
174+
Interestingly, we can see a pattern that correlates with the activity of the author during the day. Virtual memory usage is low from the previous evening (1800 hours) through the early morning (0600 hours) and then rises sharply when the author starts to work.
175+
It goes down briefly in the early afternoon when the author pauses for a break, and picks up afterward, stopping when it's time for dinner.
176+
We see that the system's core temperature follows a similar trend:
222177

223178
![img/core-temp.png](img/core-temp.png)
224179

225180
We also see that the computer was briefly unplugged after the morning work session.
226181

227182
![img/battery.png](img/battery.png)
228183

184+
These kinds of metrics are interesting to answer research questions about system resources and behavior, and represent only the tip of the iceberg in terms of the scope of data that WatchMe could help collect.
185+
For example, WatchMe would have interesting use cases for monitoring resources or jobs for HPC, or watching for changes in any kind of web resource (prices, climate data, API endpoints, etc.).
186+
For other examples, see the [WatchMe Examples](https://vsoch.github.io/watchme/examples) page.
229187

230-
These kinds of metrics are interesting to answer research questions about system
231-
resources and behavior, and represent only the tip of the iceberg in terms of the
232-
scope of data that WatchMe could help collect. For example, WatchMe would have interesting
233-
use cases for monitoring resources or jobs for HPC, or watching for changes in any
234-
kind of web resource (prices, climate data, API endpoints, etc.) For other examples, see the
235-
[WatchMe Examples](https://vsoch.github.io/watchme/examples) page.
236-
237-
More information on WatchMe, including examples, information on watcher tasks, and function documentation is provided at the WatchMe <a href="https://vsoch.github.io/watchme/" target="_blank">documentation</a>. Others are encouraged to give feedback, ask questions, and request
188+
More information on WatchMe, including examples, information on watcher tasks, and function documentation is provided at the WatchMe <a href="https://vsoch.github.io/watchme/" target="_blank">documentation</a>.
189+
Others are encouraged to give feedback, ask questions, and request
238190
new task functions or examples on <a href="https://www.github.com/vsoch/watchme/issues" target="_blank">the issue board</a>.
239191

240192
# References

watchme/tests/test_utils.py

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,7 @@
1111
import tempfile
1212
import shutil
1313
import json
14+
from sys import platform
1415
import os
1516

1617

@@ -118,7 +119,11 @@ def test_userhome(self):
118119
from watchme.utils import get_user
119120
user = get_user()
120121
userhome = get_userhome()
121-
self.assertEqual('/home/%s' % user, userhome)
122+
print("Userhome is %s" % userhome)
123+
if platform.startswith("linux"):
124+
self.assertEqual('/home/%s' % user, userhome)
125+
elif platform == "darwin":
126+
self.assertEqual('/Users/%s' % user, userhome)
122127

123128
def test_files(self):
124129
print('Testing utils.generate_temporary_files')
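
The platform branching added to `test_userhome` above can be isolated in a small helper. The following is a minimal sketch and not part of this commit: `expected_home_root` is a hypothetical name, and it assumes the same conventional `/home` and `/Users` layouts that the test assumes. Platforms the test does not cover return `None`, so a caller can skip the assertion explicitly rather than pass without checking anything.

```python
import sys


def expected_home_root(platform=sys.platform):
    """Return the conventional parent directory of user homes, or None.

    Hypothetical helper mirroring the branches in test_userhome.
    """
    if platform.startswith("linux"):
        return "/home"
    if platform == "darwin":
        return "/Users"
    # Other platforms (e.g. win32) are not covered by the test.
    return None
```

A test could then compare `os.path.dirname(userhome)` with this value only when it is not `None`. Accounts whose home lies elsewhere (for example `root`) would still need separate handling.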

watchme/utils/fileio.py

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -12,17 +12,17 @@
1212
1313
'''
1414

15-
import configparser
1615
import errno
1716
import os
18-
import pwd
19-
import re
2017
import tempfile
2118
import json
2219
import io
2320
import socket
2421
import shutil
2522
import sys
23+
import getpass
24+
25+
2626

2727
from watchme.logger import bot
2828

@@ -31,11 +31,11 @@
3131
def get_userhome():
3232
'''get the user home based on the effective uid
3333
'''
34-
return pwd.getpwuid(os.getuid())[5]
34+
return os.path.expanduser("~")
3535

3636
def get_user():
3737
'''return the active user'''
38-
return os.path.basename(get_userhome())
38+
return getpass.getuser()
3939

4040
def get_host():
4141
'''return the hostname'''
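
For reference, the two functions changed in this hunk can be read in isolation as the sketch below. It assumes only the standard library calls shown in the diff (`os.path.expanduser` and `getpass.getuser`); the docstrings are reworded here for illustration and are not the ones in the repository.

```python
import getpass
import os


def get_userhome():
    """Return the current user's home directory via tilde expansion."""
    # On POSIX this honors the HOME environment variable and falls back
    # to the password database when HOME is unset.
    return os.path.expanduser("~")


def get_user():
    """Return the login name of the active user."""
    # getpass.getuser checks LOGNAME, USER, LNAME and USERNAME in order
    # before falling back to the password database where available.
    return getpass.getuser()
```

One consequence of this design is that the user name no longer depends on the basename of the home directory, so the two values can differ when a home directory is not named after its user.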
