Comet, Gordon-Simons, and TSCC all support MATLAB, a development environment from The MathWorks. Here you will find instructions and examples for running MATLAB jobs.
To check whether your account is in the matlab group, run the groups command. For example:
$ groups
use124 matlab-users
If you are not in the matlab group, please send an email to:
for XSEDE users: help@xsede.org
for Gordon-Simons users: consult@sdsc.edu
for TSCC users: tscc-support@ucsd.edu
and ask to be added.
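The group check shown above can also be scripted. The following is a minimal sketch, not an official SDSC tool: the function name in_group is hypothetical, and the group name matlab-users is taken from the sample output above, so substitute whatever name your system reports.

```shell
#!/bin/sh
# in_group GROUP [GROUP_LIST]
# Succeeds if GROUP appears as a whole word in GROUP_LIST.
# GROUP_LIST defaults to the current user's groups as printed by `id -Gn`.
in_group() {
    wanted=$1
    list=${2-$(id -Gn)}
    for g in $list; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

# Report whether the current account appears to have MATLAB access.
if in_group matlab-users; then
    echo "matlab group membership: yes"
else
    echo "matlab group membership: no (email the help desk for your system)"
fi
```

Matching whole words avoids a false positive when one group name is a prefix of another (for example, matlab versus matlab-users).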
MATLAB must be run on the compute nodes of Comet or Gordon. There are three options, as described below. Please do *NOT* run it on the login nodes.
$ qsub -I -lnodes=1:ppn=16:native,walltime=00:30:00 -q normal
(b) Once the command in (a) runs you will be placed on a compute node. You can load the matlab module and run matlab as follows:
To get started, type one of these: helpwin, helpdesk, or demo. For product information, visit www.mathworks.com.
Academic License
>> A = [1 3 0; 2 4 -1; 4 9 -1]
(b) Once the command in (a) runs you will be placed on a compute node. Keep this window open and then, from a different terminal window, ssh directly to the compute node. For example, if step (a) put you on comet-29-01, you can do:
$ ssh -X username@comet-29-01.sdsc.edu
At this point you can do the same steps as case I to load the matlab module and then launch matlab without any options:
This will launch the GUI window.
The script matlab.sb can be submitted as follows:
$ sbatch matlab.sb
(Assumes that UsingTimeSeriestoPredictEquityReturnExample.m is in your submit directory - can be copied from the examples directory)
Once the job runs (use the squeue command to check on its status), the output file will be created in the submit directory.
Both Comet and Gordon support Parallel MATLAB, a development environment from The MathWorks. Here you will find instructions and examples for running jobs with the MATLAB Parallel Computing Toolbox on a desktop and submitting them to the MATLAB Distributed Computing Server (MDCS).
Note: MATLAB configurations are very similar on both Comet and Gordon-Simons. Where this documentation refers to Gordon-Simons, you may substitute Comet without loss of accuracy.
Important: Use of MATLAB on Comet and Gordon-Simons is limited to users from degree-granting educational institutions. To use MATLAB on Comet or Gordon-Simons, request to have your account added to the matlab UNIX group by sending an email or submitting an XSEDE Help Desk ticket.
Parallel MATLAB consists of two parts: the Parallel Computing Toolbox (PCT) and the MATLAB Distributed Computing Server (MDCS).
The PCT is a module that runs on the MATLAB client. It contains a number of useful capabilities, including:
Using the MDCS allows users to run multiprocessor jobs on Comet and Gordon-Simons via the batch queue system. The MDCS may be accessed from a user desktop with the PCT installed.
The PCT will automatically submit jobs to the MDCS (see below for details of this procedure).
The versions of MATLAB currently installed on SDSC HPC systems are:
Resource | Version | Location |
---|---|---|
Comet | 2016a | /opt/matlab/2016a |
Gordon-Simons | 2015b | /opt/matlab/2015b |
TSCC | 2015a | /opt/matlab/2015a |
Other versions may be available on some platforms. Contact SDSC Support with questions.
NOTE: The MATLAB version on the desktop must match the MDCS version. This includes both the year AND the letters 'a' or 'b'.
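To illustrate the version-matching rule, the sketch below compares a desktop release string with the MDCS release listed in the table above. The function names mdcs_release and releases_match are hypothetical, and the release values are simply copied from the table, so they may be out of date. On the desktop, MATLAB reports its own release string with version('-release').

```shell
#!/bin/sh
# mdcs_release RESOURCE
# Prints the MATLAB release listed for RESOURCE in the version table.
mdcs_release() {
    case $1 in
        comet)         echo 2016a ;;
        gordon-simons) echo 2015b ;;
        tscc)          echo 2015a ;;
        *)             return 1 ;;
    esac
}

# releases_match DESKTOP_RELEASE RESOURCE
# Succeeds only if both the year and the letter are identical.
# A leading "R" (as in "R2016a") is ignored.
releases_match() {
    desktop=${1#R}
    cluster=$(mdcs_release "$2") || return 1
    [ "$desktop" = "$cluster" ]
}

releases_match R2016a comet && echo "comet: versions match"
releases_match 2016b comet || echo "comet: a 2016b desktop does not match"
```

The comparison is a plain string equality on purpose: 2016a and 2016b share a year but are different releases, so they must not be treated as compatible.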
To use the desktop PCT with Comet or Gordon-Simons you must have:
Linux and Mac OS X include built-in key-generation programs as part of their default system environments, but Windows does not. One option for Windows users is to download PuTTY and use it to generate the key pairs. See the section Configuring Secure Shell on a Desktop for Use with MATLAB Parallel Computing Toolbox, which describes how to generate key pairs on your desktop and install them on Gordon-Simons.
See some examples of MATLAB usage to better understand the process.
MATLAB has unique setup requirements for users with a Parallel Computing Toolbox on their Windows desktop, compared with other desktop platforms. A secure shell is required to access the Comet or Gordon-Simons MATLAB server, and this must be installed if not already available. For Linux and Mac OS X, the default system shell will suffice.
When using the Comet- or Gordon-Simons-based toolbox and client, this desktop configuration is not required.
To access the Distributed Computing server on Comet or Gordon-Simons from a desktop system (in other words, to use the Parallel Computing Toolbox with a client not installed on Comet or Gordon-Simons), you must have a secure shell installed on the desktop. The setup process is different for Windows and Linux/Mac.
ssh-keygen -i -f public_key.in > public_key.out
in your .ssh directory.
UNIX systems (Linux, Mac OS X, etc.) have native implementations of ssh and scp, so when using a desktop MATLAB client, the only requirements are:
If you do not have an ssh keypair on your desktop, you can generate one as follows:
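As a rough sketch of key generation with OpenSSH on a Linux or Mac desktop (not necessarily the exact procedure SDSC recommends), the hypothetical helper below wraps ssh-keygen. The RSA key type is an assumption based on the id_rsa key file used in this guide's examples, and the optional passphrase argument exists only so the helper can run non-interactively.

```shell
#!/bin/sh
# make_keypair KEYFILE [PASSPHRASE]
# Creates an RSA key pair at KEYFILE and KEYFILE.pub.
# Refuses to overwrite an existing key. If PASSPHRASE is omitted,
# ssh-keygen prompts for one interactively.
make_keypair() {
    keyfile=$1
    if [ -e "$keyfile" ]; then
        echo "refusing to overwrite $keyfile" >&2
        return 1
    fi
    mkdir -p "$(dirname "$keyfile")"
    if [ $# -ge 2 ]; then
        ssh-keygen -q -t rsa -N "$2" -f "$keyfile"
    else
        ssh-keygen -t rsa -f "$keyfile"
    fi
}

# Typical interactive use on the desktop (commented out so that sourcing
# this file has no side effects):
# make_keypair "$HOME/.ssh/id_rsa"
```

After the pair is created, the public half (the .pub file) still has to be appended to ~/.ssh/authorized_keys in your account on the cluster before key-based logins will work.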
First create a parallel profile in MATLAB for the Comet or Gordon-Simons cluster. To do this:
Next, download the file archive which contains these files:
communicatingJobWrapper.sh communicatingSubmitFcn.m createSubmitScript.m deleteJobFcn.m extractJobId.m getJobStateFcn.m getRemoteConnection.m getSubmitString.m independentJobWrapper.sh independentSubmitFcn.m
Copy these files to the toolbox/local directory of your local MATLAB installation. These are modified versions of the MATLAB files that come with the MATLAB release. These files allow you to specify several job parameters.
If you have used MDCS before on TSCC, Gordon-Simons or Trestles and are planning to run it on Comet, please update the files in your MATLAB installation under toolbox/local by downloading the new archive linked above. The new files also work with both TORQUE and SLURM.
Using these modified files, you may also set:
Other parameters that need to be set include
Here is an example MATLAB function that creates a cluster object:
function [ cluster ] = getCluster(username,account,clusterHost,ppn,queue,time,DataLocation, ...
    RemoteDataLocation,keyfile,ClusterMatlabRoot)
% Configure a cluster object from the generic profile for remote submission.
cluster = parcluster('GenericProfile1');
set(cluster,'HasSharedFilesystem',false);
set(cluster,'JobStorageLocation',DataLocation);
set(cluster,'OperatingSystem','unix');
set(cluster,'ClusterMatlabRoot',ClusterMatlabRoot);
set(cluster,'IndependentSubmitFcn',{@independentSubmitFcn,clusterHost, ...
    RemoteDataLocation,account,username,keyfile,time,queue});
set(cluster,'CommunicatingSubmitFcn',{@communicatingSubmitFcn,clusterHost, ...
    RemoteDataLocation,account,username,keyfile,time,queue,ppn});
set(cluster,'GetJobStateFcn',{@getJobStateFcn,username,keyfile});
set(cluster,'DeleteJobFcn',{@deleteJobFcn,username,keyfile});
The getCluster function takes as its arguments all the parameters listed above, and returns a MATLAB cluster object that will be used to create an MDCS job.
Here is a MATLAB script that sets these parameters and runs a simple MDCS example job:
% Job and account parameters (edit these for your own account)
processors=32
clusterHost='gordon.sdsc.edu'
ppn=16
username='jpg'
account='use300'
queue='normal'
time='01:00:00'
DataLocation='/Users/jpg/Documents/MATLAB/data'
RemoteDataLocation='/home/jpg/matlab/data'
keyfile='/Users/jpg/.ssh/id_rsa'
matlabRoot='/opt/matlab/2013b'
% Build the cluster object, then create and submit a communicating job
cluster = getCluster(username,account,clusterHost,ppn,queue,time,DataLocation, ...
    RemoteDataLocation,keyfile,matlabRoot);
j = createCommunicatingJob(cluster);
j.AttachedFiles={'testparfor2.m'};
set(j,'NumWorkersRange',[1 processors]);
set(j,'Name','Test');
t = createTask(j,@testparfor2,1,{processors});
submit(j);
wait(j);
pause(30);
% Fetch and display the task output
o=j.fetchOutputs;
o{:}
In this example, a cluster object (cluster) is returned by getCluster(), which is passed to createCommunicatingJob(), which returns a job object. The files that are required on the cluster are defined, as well as the number of processors (32 in this case). A task is created that will call the function testparfor2() which has one output argument and one input argument with the value 32. The job is then submitted, and the output is stored in a MATLAB cell array.
Here is the function that is submitted to run:
function a = testparfor2(N)
a = zeros(N,1);
parfor(i=1:N)
    feature getpid
    a(i) = ans
end
In this simple example, an array of dimension N is initialized with zeros, and the process ID is written to each array element. Since in this case we have passed the number 32 to N and we have asked for 32 processors, we might expect to get 32 different processes to run the tasks.
However, when we examine our output array:
ans = 27892 24571 27880 24572 24576 24573 27893 24578 24587 27877 24583 24586 27884 27876 27881 24585 24581 24582 27889 27887 27879 24574 27883 27890 24588 27886 24580 24575 27888 27891 27878 27880
we can see that only 31 worker processes were used to perform the job. Two of the loop passes were performed by the same process (27880). The reason is that MATLAB uses one worker to run the serial code, allocating the remaining workers to perform the parallel functions. Since this only leaves 31 available workers, one of them must handle two loop iterations.
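The count of 31 can be verified directly from the output. The sketch below is a shell check run on the desktop, not part of the MDCS job; the PID values are copied from the output array above.

```shell
#!/bin/sh
# PIDs copied from the output array, in loop-index order.
pids="27892 24571 27880 24572 24576 24573 27893 24578
      24587 27877 24583 24586 27884 27876 27881 24585
      24581 24582 27889 27887 27879 24574 27883 27890
      24588 27886 24580 24575 27888 27891 27878 27880"

# $pids is deliberately unquoted so the shell splits it into one PID per line.
total=$(printf '%s\n' $pids | awk 'END { print NR }')
distinct=$(printf '%s\n' $pids | sort -u | awk 'END { print NR }')
repeated=$(printf '%s\n' $pids | sort | uniq -d)

echo "loop iterations:  $total"
echo "distinct workers: $distinct"
echo "PID used twice:   $repeated"
```

This reports 32 loop iterations, 31 distinct worker PIDs, and 27880 as the one PID that handled two iterations.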
The current implementation of MATLAB on Comet generates a folder named "JobX", where X is the job number, which is assigned sequentially. This folder contains several files for each node, or "Task".
The results of a parallel MATLAB job submitted to Comet in this way, such as the array shown above, are located in the Task1.out.mat file in the 'argsout' variable. Technically, the argsout variable is a 1x1 cell array, but the single cell can contain a multidimensional array.
Download a zip archive containing example files for this example.
To run the above example, open the file 'run_testparfor1.m' in your local MATLAB client (this example uses Windows) and run it. You will need to modify some of the parameters, such as file paths and usernames, to match your account on Comet or Gordon-Simons.
Example 2: FFT with external file using Parallel Matlab
This is a very simple example that includes an external audio file for which we will calculate the FFT on a sequence of time series epochs. The included file is an audio file but a similar process can be performed on any time series data. Here is the basic code which takes the file 'audio_sample.wav' as input:
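function parfor_fft = parfor_fft1(N)
% Power spectrum of successive 256-sample epochs of audio_sample.wav.
% Note: the input argument N is not used in the body.
x = audioread('audio_sample.wav');
step = 1000;      % samples between the starts of successive epochs
epoch = 256;      % samples per epoch (also the FFT length)
maxinc = 17287;   % number of epochs to process
parfor_fft = zeros(maxinc,256);
parfor k = 1:1:maxinc
    X = fft(x(k*step:(k*step)+epoch-1),256);   % FFT of epoch k
    XX = X.*conj(X)/256;                       % power spectrum of epoch k
    parfor_fft(k,:) = XX;
end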
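When adapting this code to a different audio file, maxinc has to be small enough that the last epoch stays inside the data: iteration k reads samples k*step through k*step+epoch-1. The sketch below works through that arithmetic. The function name max_epochs is hypothetical, and the sample counts are illustrative values chosen to match maxinc = 17287, not measured properties of audio_sample.wav.

```shell
#!/bin/sh
# max_epochs NSAMPLES STEP EPOCH
# Largest k such that k*STEP + EPOCH - 1 <= NSAMPLES
# (1-based sample indices, as in MATLAB).
max_epochs() {
    nsamples=$1
    step=$2
    epoch=$3
    echo $(( (nsamples - epoch + 1) / step ))
}

# Last sample index read when maxinc = 17287, step = 1000, epoch = 256.
echo "last sample read: $(( 17287 * 1000 + 256 - 1 ))"

# A file with exactly that many samples supports 17287 epochs;
# one sample fewer supports only 17286.
echo "epochs for 17287255 samples: $(max_epochs 17287255 1000 256)"
echo "epochs for 17287254 samples: $(max_epochs 17287254 1000 256)"
```

Because the first epoch starts at sample k*step with k = 1, the first 999 samples of the file are never read by this loop.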
Download a zip archive containing example files for this example.
To run the above example, open the file 'run_parfor_fft1.m' in your local MATLAB client (this example uses Windows) and run it. You will need to modify some of the parameters, such as file paths and usernames, to match your account on Comet or Gordon-Simons.