- First, I call a script that writes out a CSV for each bird containing all the predictor variables and the response.
- The next script reads that CSV and fits the model (a minimal sketch of both steps follows this list).
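To make the two steps concrete, here is a rough R sketch of what each script might look like. The file names (step1_write_csv.R, step2_fit_model.R), the observations data frame, and the lm() call are all assumptions for illustration, not our actual code; the perspp/<species>/bird.csv layout matches the copy loop shown at the end of this section.

# step1_write_csv.R -- hypothetical name; writes one CSV per bird species.
# Assumes a data frame `obs` with a `species` column plus the predictors
# and the response.
obs <- readRDS("shared/observations.rds")   # assumed input location
for (sp in unique(obs$species)) {
  dir.create(file.path("perspp", sp), recursive = TRUE, showWarnings = FALSE)
  write.csv(obs[obs$species == sp, ],
            file.path("perspp", sp, "bird.csv"), row.names = FALSE)
}

# step2_fit_model.R -- hypothetical name; reads one bird's CSV and fits a model.
args <- commandArgs(trailingOnly = TRUE)    # e.g. the species folder name
dat  <- read.csv(file.path("perspp", args[1], "bird.csv"))
fit  <- lm(response ~ ., data = dat)        # stand-in for the real model
saveRDS(fit, file.path("perspp", args[1], "fit.rds"))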
As far as I can tell, the way the DAG command is set up, it is quite good at finding things recursively in folders. The "shared" folder is reserved for use by the program: the tool appears to walk the expected folder structure and work out where everything you reference lives, presumably by taking note of the paths. So unless a location is passed as an argument to your script, it should be found automatically. In my case, I was able to copy all my folders with the data to be incorporated into the shared folder, and the paths to the right places were found. Practically speaking, this means you can take "setwd" and any "hard" file locations out of your scripts and use relative paths instead (see the sketch below); the job already knows its starting location when it runs over Condor.
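As an illustration of that change, here is a before/after sketch; the specific paths are invented, but the pattern is simply swapping absolute locations for paths relative to the job's working directory.

# Before: breaks on the execute node, since /home/me does not exist there
# setwd("/home/me/projects/birds")          # hard-coded location (made up)
# dat <- read.csv("/home/me/projects/birds/shared/covariates.csv")

# After: relative to wherever Condor starts the job
dat <- read.csv("shared/covariates.csv")    # resolved from the job's start dir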
As you can imagine, we are beginners, and there is a lot these applications can do. Thankfully, we have a good, communicative person helping us get started. For example, I learned how to string together dependent workflows. In the example I've been describing, this means I can automate step 1 leading into step 2 by making my own top-level DAG. Follow the basic instructions for making a DAG for step 1, and then put your shared folder for step 2 inside step 1's output folder. Then make a DAG for step 2 as you normally would (this time, the input folder is the output of the previous step). Finally, you can splice the two together (see the section on splicing at that link). Here was the example given to me...
SPLICE job1 mydag.dag DIR step1_out
SPLICE job2 mydag.dag DIR step2_out
PARENT job1 CHILD job2
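As I understand the DIR option, each splice's DAG file is read relative to its directory, so this top-level DAG expects step1_out/mydag.dag and step2_out/mydag.dag (which is why the same file name appears twice); the PARENT/CHILD line tells DAGMan not to start any of job2's nodes until every node in job1 has finished. Our setup may differ in the details, so check the DAGMan docs on splices.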
To copy all the per-species CSVs into the input folder later...
for subdir in perspp/*; do cp --parents "$subdir"/bird.csv input; done
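Note that the --parents flag (GNU cp) recreates the source path under the destination, so each file lands at input/perspp/<species>/bird.csv instead of every bird.csv overwriting the last one directly in input/.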