Remote Job Runner
The remote job runner feature in Unison lets users remotely trigger a Unison task that takes a flat file as its input. This makes it easier to automate jobs without manually running the task and downloading the output.
For example, when there is new data to process, the file can be dropped off in a pickup directory. The remote job runner uses the file as input and, once the job is complete, writes an output file to a specific directory. This process is useful for ETL and other frequently repeated data processing.
Steps to enable remote job runner for Unison
Make a pickup path.
sudo mkdir -p /path/to/pickup
Make a processing path.
sudo mkdir -p /path/to/processing
Get the deployment name, which is unique to each host.
sudo kubectl get deployments -n unison | grep unison-api
Copy the deployment name from the output.
Note
Steps 4-7 reuse the paths from steps 1-2 and the deployment name from step 3. In the commands below, replace unison-api-hostname, /path/to/pickup, and /path/to/processing with your own values.
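Step 3 can be scripted so the deployment name does not have to be copied by hand. The following is a minimal sketch, assuming the default table output of `kubectl get deployments` (name in the first column) and that the deployment name starts with `unison-api`. The function name `pick_deployment` and the variable `DEPLOYMENT` are hypothetical and not part of Unison.

```shell
# Hypothetical helper: read `kubectl get deployments` table output on stdin
# and print the name (first column) of the first deployment whose name
# starts with "unison-api".
pick_deployment() {
  awk '$1 ~ /^unison-api/ { print $1; exit }'
}

# Usage (requires cluster access):
#   DEPLOYMENT=$(sudo kubectl get deployments -n unison | pick_deployment)
#   echo "$DEPLOYMENT"
```

If more than one deployment matches, only the first is printed, so it is worth checking the value before using it in the patch commands.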
Patch the deployment to set the pickup (drop-off) location.
sudo kubectl patch deployment unison-api-hostname -p '{"spec":{"template":{"spec":{"containers":[{"name":"unison-api","env":[{"name":"MONITOR_PATH","value":"/path/to/pickup"}]}]}}}}' -n unison
Patch the deployment to set the processing location.
sudo kubectl patch deployment unison-api-hostname -p '{"spec":{"template":{"spec":{"containers":[{"name":"unison-api","env":[{"name":"INPUT_PATH","value":"/path/to/processing"}]}]}}}}' -n unison
Patch the deployment to add a volume and a volume mount for each of the two paths.
sudo kubectl patch deployment unison-api-hostname -p '{"spec":{"template":{"spec":{"volumes":[{"hostPath":{"path":"/path/to/pickup","type":""},"name":"pickup"}]}}}}' -n unison
sudo kubectl patch deployment unison-api-hostname -p '{"spec":{"template":{"spec":{"volumes":[{"hostPath":{"path":"/path/to/processing","type":""},"name":"processing"}]}}}}' -n unison
sudo kubectl patch deployment unison-api-hostname -p '{"spec":{"template":{"spec":{"containers":[{"name":"unison-api","volumeMounts":[{"mountPath":"/path/to/pickup","name":"pickup"}]}]}}}}' -n unison
sudo kubectl patch deployment unison-api-hostname -p '{"spec":{"template":{"spec":{"containers":[{"name":"unison-api","volumeMounts":[{"mountPath":"/path/to/processing","name":"processing"}]}]}}}}' -n unison
Patch the deployment to turn on the file pickup feature.
sudo kubectl patch deployment unison-api-hostname -p '{"spec":{"template":{"spec":{"containers":[{"name":"unison-api","env":[{"name":"FILE_RUNNER_ENABLED","value":"true"}]}]}}}}' -n unison
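The environment-variable patches in steps 4, 5, and 7 share one JSON shape and differ only in the variable name and value. As a sketch, a small helper can build that patch body so the three commands are less error-prone to type. The function name `env_patch` is hypothetical, and the helper assumes the name and value contain no double quotes or backslashes, since it does no JSON escaping.

```shell
# Hypothetical helper: print the patch body that sets one environment
# variable on the unison-api container.
# Assumes $1 (name) and $2 (value) need no JSON escaping.
env_patch() {
  printf '{"spec":{"template":{"spec":{"containers":[{"name":"unison-api","env":[{"name":"%s","value":"%s"}]}]}}}}' "$1" "$2"
}

# Usage (requires cluster access; unison-api-hostname is the name from step 3):
#   sudo kubectl patch deployment unison-api-hostname -n unison -p "$(env_patch MONITOR_PATH /path/to/pickup)"
#   sudo kubectl patch deployment unison-api-hostname -n unison -p "$(env_patch INPUT_PATH /path/to/processing)"
#   sudo kubectl patch deployment unison-api-hostname -n unison -p "$(env_patch FILE_RUNNER_ENABLED true)"
#   sudo kubectl rollout status deployment/unison-api-hostname -n unison
```

The final `kubectl rollout status` call is an optional check that the patched deployment has finished rolling out before any files are dropped off.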
Steps to process a job remotely
Make a project in Unison. We strongly suggest a single-word project name without spaces or underscores, as these can cause issues on the command line. Upload the input file that will be used for future remote executions, then configure and save the project.
Make a directory whose name matches the Unison project you want to execute remotely.
sudo mkdir -p /path/to/pickup/projectName
Drop off the file in the project directory from step 2. Spaces in file names can cause issues, so keep the name to a single word, and see the note below about underscores in file names. Make sure the file has the same columns as the input file used to configure the project.
cp fileName /path/to/pickup/projectName
Note
The filename cannot include underscores, because the underscore is a special character used for internal processing. The filename should also be unique: the output file takes the same name, so a unique name avoids conflicts.
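The naming rules above (no underscores, no spaces) can be checked before a file is copied into the pickup directory. This is a minimal sketch; `valid_job_filename` is a hypothetical helper, not a Unison command, and it checks only the two rules stated here.

```shell
# Hypothetical helper: succeed only if the file name is non-empty and
# contains no underscores or whitespace.
valid_job_filename() {
  case "$1" in
    ''|*_*|*[[:space:]]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Usage: copy the file only when its name passes the check.
#   f=fileName
#   valid_job_filename "$(basename "$f")" && cp "$f" /path/to/pickup/projectName
```

The check runs on the base name only, so directories in the source path may still contain underscores or spaces.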
Wait for the file to be processed. When processing starts, the file disappears from the pickup location. When processing finishes, a Jobs_Completed folder appears in your project directory if it does not already exist, and the output file is placed in it with the same name as the file you dropped off.
Note
When processing new data, either clear the existing output file from the Jobs_Completed folder or give the new data file a unique name. Reusing an input file name will not overwrite an existing output file in the Jobs_Completed folder.
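For automation, the wait in the last step can be replaced by polling for the output file. The sketch below assumes the project directory is /path/to/pickup/projectName, so the output appears at /path/to/pickup/projectName/Jobs_Completed/fileName; that location is an assumption and should be confirmed on your host. `wait_for_output` is a hypothetical helper.

```shell
# Hypothetical helper: poll once per second until the given file exists.
# $1 = path to the expected output file
# $2 = timeout in seconds (defaults to 300)
# Returns 0 if the file appears in time, 1 on timeout.
wait_for_output() {
  output_file=$1
  timeout=${2:-300}
  waited=0
  while [ ! -e "$output_file" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Usage (path is an assumption, see above):
#   wait_for_output /path/to/pickup/projectName/Jobs_Completed/fileName 600 \
#     && echo "output ready" || echo "timed out"
```

Because an existing output file is not overwritten, a leftover file with the same name would satisfy this check immediately, which is another reason to use unique file names. The check also confirms only that the file exists, not that it has been completely written.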